JUL 20 1935 


AMERICAN 
JOURNAL OF MATHEMATICS 


FOUNDED BY THE JOHNS HOPKINS UNIVERSITY 


EDITED BY 


E. T. BELL ABRAHAM COHEN 
CALIFORNIA INSTITUTE OF TECHNOLOGY THE JOHNS HOPKINS UNIVERSITY 
E. W. CHITTENDEN G. C. EVANS 
UNIVERSITY OF IOWA UNIVERSITY OF CALIFORNIA 
F. D. MURNAGHAN 
THE JOHNS HOPKINS UNIVERSITY 


WITH THE COOPERATION OF 
MARSTON MORSE ALONZO CHURCH 


J. R. KLINE L. R. FORD 
E. P. LANE OSCAR ZARISKI 


HARRY LEVY 


FRANK MORLEY 
HARRY BATEMAN 
W. A. MANNING 


PUBLISHED UNDER THE JOINT AUSPICES OF 


THE JOHNS HOPKINS UNIVERSITY 
AND 


THE AMERICAN MATHEMATICAL SOCIETY 


Volume LVII, Number 3 
JULY, 1935 


THE JOHNS HOPKINS PRESS 
BALTIMORE, MARYLAND 
U. S. A. 




CONTENTS 


Cyclotomy, higher congruences, and Waring's problem II. By L. E. 
DICKSON 

The equivalence of non-singular pencils of Hermitian matrices in an 
arbitrary field. By J. WILLIAMSON 

On the rational canonical form of a function of a matrix. By NEAL 
H. McCOY 

On certain types of hexagons. By J. R. MUSSELMAN 

On the abstract properties of linear dependence. By HASSLER WHITNEY 

On the asymptotic distribution of the remainder term of the prime-number 
theorem. By AUREL WINTNER 

On the exact value of the bound for the regularity of solutions of ordinary 
differential equations. By AUREL WINTNER 

On symmetric Bernoulli convolutions. By RICHARD KERSHNER and 
AUREL WINTNER 

On uniform convergence. By J. W. THEODORE YOUNGS 

On the momentum problem for distribution functions in more than one 
dimension. By E. K. HAVILAND 

A note on a property of Fourier-Stieltjes transforms in more than one 
dimension. By E. K. HAVILAND 

The theory of the second variation for the non-parametric problem of 
Bolza. By WILLIAM T. REID 

Concerning some methods of best approximation, and a theorem of 
Birkhoff. By I. M. SHEFFER 

Groups containing five and only five squares. By G. A. MILLER 

Correction and addition to "complements of potential theory." By 
GRIFFITH C. EVANS 

On a certain class of orthogonal polynomials. By A. TARTLER 

Metabelian groups and pencils of bilinear forms. By H. R. BRAHANA 

A metrically transitive group defined by the modular group. By GUSTAV 
A. HEDLUND 

Some intrinsic and derived vectors in a Kawaguchi space. By J. L. SYNGE 

An analytic characterization of surfaces of finite Lebesgue area. Part I. 
By CHARLES B. MORREY, JR. 


THE AMERICAN JOURNAL OF MATHEMATICS will appear four times yearly. 
The subscription price of the JOURNAL for the current volume is $7.50 (foreign 
postage 50 cents); single numbers $2.00. 
A few complete sets of the JOURNAL remain on sale. 
Papers intended for publication in the JOURNAL may be sent to any of the Editors. 
Editorial communications may be sent to Dr. A. COHEN at The Johns Hopkins 
University. 
Subscriptions to the JOURNAL and all business communications should be sent to 
THE JOHNS HOPKINS PRESS, BALTIMORE, MARYLAND, U. S. A. 


Entered as second-class matter at the Baltimore, Maryland, Postoffice, acceptance for mailing at special 
rate of postage provided for in Section 1103, Act of October 3, 1917, Authorized on July 3, 1918 


PRINTED IN THE UNITED STATES OF AMERICA 
BY J. H. FURST COMPANY, BALTIMORE, MARYLAND 




CYCLOTOMY, HIGHER CONGRUENCES, AND WARING’S 
PROBLEM II.¹ 


By L. E. Dickson. 


Part 2. THE WARING PROBLEM FOR POLYNOMIAL SUMMANDS. 


29. Introduction and summary. In 1770, E. Waring conjectured that 
every positive integer N is a sum of 9 integral cubes ≥ 0, also that N is a sum 
of 19 fourth powers, etc. Hardy and Littlewood proved that every sufficiently 
large N is a sum of 

(160) (k − 2)2^{k−2} + k + 5 + ζ_k 


integral k-th powers ≥ 0, where ζ_k is the greatest integer ≤ the quotient of 
(k − 2) log 2 − log k + log (k − 2) by log k − log (k − 1). Except for 
very small values of k, ζ_k is quite small compared to (160); for example, 
ζ_10 = 50, ζ_28 = 493. 

Waring conjectured also that every N is a sum of a limited number of 
values of a polynomial in x of degree k. In precise form, this was proved by 
E. Kampke.² But neither writer gave any information as to the number of 
values needed. For k = 3, 9 values suffice.³ For k ≥ 4 the analytic part 
of the proof that s values of a polynomial suffice for a large N has been made 
by Miss Humphreys.⁴ We here treat the second part of the proof, viz., that 
if A is any integer and p is any prime not dividing ⁵ k, then A is congruent 
modulo p to a sum of n values of the polynomial, where n < s in (160). We 
find that 

k    3    4    5    6    7    8    9    10 
n    4    6   12   24   48   72  144   216 
s    9   19   41   87  192  425  949  2113. 

If k is one of the even numbers 6, 8, …, 18, then 

n = n(k) ≤ 8·3^{(k−4)/2}, 

which is less than the first term of (160) since 3 < 4. 


¹ Part I of this paper appeared in the current volume of this JOURNAL, pp. 391-424. 

² Mathematische Annalen, Bd. 83 (1921), pp. 85-112. 

³ Dickson, Transactions of the American Mathematical Society, vol. 36 (1934), 
pp. 1-12, 739-741; R. D. James, American Journal of Mathematics, vol. 56 (1934), 
pp. 303-315. 

⁴ Duke Mathematical Journal, vol. 1 (1935). 

⁵ For the case when p divides k, see § 45. 



For any odd k, n(k) ≤ 2n(k − 1), whence n(k) is much less than the 
first term of (160) when k = 7, 9, …, 19. If we write s = S + ζ_k, then 

k    20           22            24            26 
n    384,912      1,154,736     57,736,800    173,210,400 
S    4,718,617    20,971,547    92,274,717    402,653,215. 


Hence n < s for k < 28. For k = 28, s exceeds a billion, and the actual value 
of s is of slight interest beyond the fact that there exists an s (Kampke). 
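
The tabulated values of s and S can be checked numerically. The sketch below is not part of the original paper; it assumes that (160) reads (k − 2)2^{k−2} + k + 5 + ζ_k, with ζ_k the greatest integer not exceeding the stated quotient of logarithms, a reading which reproduces every entry of the two tables. The function names are ours, and floating-point logarithms are adequate only because no quotient in this range falls near an integer (apart from k = 3, where it is exactly −1).

```python
import math

def zeta(k):
    # Greatest integer <= [(k-2) log 2 - log k + log(k-2)] / [log k - log(k-1)].
    numerator = (k - 2) * math.log(2) - math.log(k) + math.log(k - 2)
    denominator = math.log(k) - math.log(k - 1)
    return math.floor(numerator / denominator)

def big_s(k):
    # The part S of (160) that does not involve zeta_k (assumed reading).
    return (k - 2) * 2 ** (k - 2) + k + 5

def small_s(k):
    # The full Hardy-Littlewood number s = S + zeta_k.
    return big_s(k) + zeta(k)

if __name__ == "__main__":
    for k in range(3, 11):
        print(k, small_s(k))
```
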


30. Normal polynomials. We may exclude polynomials g(x) whose 
values for integers x are all multiples of p, since integers not multiples of p 
are not represented as a sum of values of g(x). The true Waring problem 
relates to summands g/p and not to summands g. 

By the degree of g(x) modulo p we mean the exponent of the highest 
power of x whose coefficient c is prime to p. We seek n such that 


(161) A ≡ Σ_{i=1}^{n} g(x_i) (mod p) 


has integral solutions x_i for every integer A. We desire that the same n shall 
serve not only for every A, but for every polynomial g(x) of given degree 
k modulo p. We write n = n(k) = n(k, p). 

Determine d by cd ≡ 1 (mod p). Then dA ranges with A over a complete 
set of residues modulo p. Hence the problem for (161) reduces to that 
for dg(x), whose c is ≡ 1 (mod p). If k is prime to p, we take x = X + z 
and choose z so that the coefficient of X^{k−1} is divisible by p. If C is the 
constant term of g(x), write g = H + C. Then (161) is equivalent to 
A − nC ≡ Σ H(x_i). 


THEOREM 13. When k is not divisible by p, the problem for (161) 
reduces to the like problem for a NORMAL polynomial whose leading coefficient 
is unity, while the coefficient of x^{k−1} and the constant term are both zero. 


LEMMA 1. If neither r nor s is divisible by p > 2 and if A is any integer, 
there exist solutions of 
rx² + sy² ≡ A (mod p). 


This special case of Theorem 6 is evident since each of rx² and A − sy² 
takes 1 + ½(p − 1) incongruent values and hence they have a common value. 
When k = 2 the only normal polynomial is x². Then Lemma 1 with 
r = s = 1 gives 
(162) n(2,p) = 2, p > 2. 
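
Lemma 1 is easily confirmed by direct enumeration for small primes. The following sketch is our own illustration (the function name is hypothetical); it follows the counting argument just given by comparing the set of values of rx² with the set of values of A − sy².

```python
def lemma1_holds(p, r, s):
    # Values taken by x^2 modulo p (there are 1 + (p-1)/2 of them).
    squares = {x * x % p for x in range(p)}
    left = {r * q % p for q in squares}
    # For every A, some value A - s*y^2 must coincide with some value r*x^2.
    return all(
        any((A - s * q) % p in left for q in squares)
        for A in range(p)
    )

if __name__ == "__main__":
    print(all(lemma1_holds(7, r, s) for r in range(1, 7) for s in range(1, 7)))
```
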




31. Odd k. Let k be not divisible by p. By choice of z, g(x + z) 
becomes 

f(x) = Σ_{i=0}^{k} c_i x^i, c_{k−1} not divisible by p. 

Then 

f(x) + f(−x) = 2c_{k−1}x^{k−1} + 2c_{k−3}x^{k−3} + ⋯ = H(x). 


Take p > 2. Then 2c_{k−1} is prime to p. By definition, any A is congruent to a 
sum of n(k − 1, p) values of H(x). 


THEOREM 14. If k is odd and not divisible by p > 2, then 
n(k, p) ≤ 2n(k − 1, p). 
n(3, p) ≤ 4 if p > 3. 
32. Case k = 4. We employ the fact that the sum of the fourth powers 
of a + d, a − d and −2a is 2t², the sum of their squares is 2t, and their 
sum is zero, where t = 3a² + d². For p > 2 every normal polynomial of 
degree 4 is of the form f(x) = x⁴ + 2ux² + vx. Hence 

2t² + 2t·2u = f(a + d) + f(a − d) + f(−2a). 


The left member is 2(y² − u²), where y = t + u. Employ also a second such 
identity and add the two. Hence N = 2(y² + z² − 2u²) is a sum of six 
values of f(x). 

Take p > 3 and apply Lemma 1. Thus integers y and z may be chosen 
so that N takes any assigned value modulo p. For the t determined by y, 
3a² + d² ≡ t (mod p) is solvable. Similarly for τ determined by z, 3α² + δ² ≡ τ 
(mod p) is solvable. This proves 


(163) n(4, p) ≤ 6 if p > 3. 
Hence by Theorem 14, 
(164) n(5, p) ≤ 12 if p > 5. 
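
The identity used in § 32 and the bound (163) can be tested by machine. The sketch below is an illustration of ours, not part of the paper; it assumes the normal quartic has the form f(x) = x⁴ + 2ux² + vx and that t = 3a² + d², and it checks (163) for a given prime by forming all sums of six values of f modulo p.

```python
def f(x, u, v):
    # Normal polynomial of degree 4 (assumed form x^4 + 2u x^2 + v x).
    return x ** 4 + 2 * u * x ** 2 + v * x

def identity_holds(a, d, u, v):
    # f(a+d) + f(a-d) + f(-2a) = 2t^2 + 4ut with t = 3a^2 + d^2.
    t = 3 * a * a + d * d
    return f(a + d, u, v) + f(a - d, u, v) + f(-2 * a, u, v) == 2 * t * t + 4 * u * t

def six_values_suffice(p, u, v):
    # Every residue modulo p should be a sum of six values of f.
    values = {f(x, u, v) % p for x in range(p)}
    sums = {0}
    for _ in range(6):
        sums = {(s + w) % p for s in sums for w in values}
    return len(sums) == p

if __name__ == "__main__":
    print(all(six_values_suffice(7, u, v) for u in range(7) for v in range(7)))
```
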


33. LEMMA 2. For every integer A, there exist ⁶ integral solutions of 

(165) Σ_{i=1}^{r} x_i^k ≡ A (mod p), 

where r is the g.c.d. of k and p − 1. 
Taking A = −1, x_{r+1} = 1, we obtain 


⁶ Landau, Vorlesungen über Zahlentheorie, I (1927), p. 290. The proof is quite 
elementary. 


LEMMA 3. For K = r + 1 there exist solutions of 

(166) Σ_{i=1}^{K} h_i^k ≡ 0, not every h_i ≡ 0 (mod p). 


34. Even k. Let k be not divisible by p > 2. As in § 30, it suffices to 
consider a polynomial of the form f(x) = x^k + ax^{k−1} + ⋯. By (166), 


If C is not divisible by p, we have the result desired. Next, let C=0. By 
(166) a certain h; is prime to p. Let Q(y) be derived from P(y) by changing 
the sign of h;. If also the leading coefficient of Q(y) is divisible by p, 
evidently = 0, k; =0 (mod p), contrary to hypothesis. 


THEOREM 15. Let k be even and not divisible by the odd prime p. Choose 
K (≤ r + 1) so that (166) is solvable. Then 
n(k, p) ≤ Kn(k − 1, p). 

35. Lemmas, chiefly on congruences. 

LEMMA 4. If q is prime to p − 1, every integer is congruent to a q-th 
power modulo p. 

Since there are integral solutions of v(p − 1) + 1 = uq, Fermat's theorem 
gives x ≡ x^{v(p−1)+1} = (x^u)^q (mod p). 

LEMMA 5. Let r be the g.c.d. of k and p − 1. If each a_i is prime to p, 
(168) a_1x_1^k + ⋯ + a_sx_s^k ≡ c (mod p) 
has the same number of solutions as 
(169) a_1y_1^r + ⋯ + a_sy_s^r ≡ c (mod p). 

Consider any solution of a_1z_1 + ⋯ + a_sz_s ≡ c (mod p). For 
(i = 1, …, s), we shall prove that x_i^k ≡ z_i and y_i^r ≡ z_i (mod p) have the 
same number of roots. This is evident unless z_i is prime to p. Then 

k Ind x_i ≡ Ind z_i (mod p − 1) 

has no root or r roots x_i according as Ind z_i is not or is divisible by r. The 
same is true for r Ind y_i ≡ Ind z_i. 
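
Lemma 5 lends itself to a direct count. The following sketch is ours and merely illustrative (the function name is hypothetical); it counts by exhaustion the solutions of a_1x_1^e + ⋯ + a_sx_s^e ≡ c (mod p), so that the exponents e = k and e = r, where r is the g.c.d. of k and p − 1, may be compared.

```python
from itertools import product
from math import gcd

def count_solutions(p, exponent, coefficients, c):
    # Number of tuples (x_1, ..., x_s) modulo p with sum a_i * x_i^exponent == c.
    return sum(
        1
        for xs in product(range(p), repeat=len(coefficients))
        if (sum(a * pow(x, exponent, p) for a, x in zip(coefficients, xs)) - c) % p == 0
    )

if __name__ == "__main__":
    p, k = 13, 8
    r = gcd(k, p - 1)
    print(count_solutions(p, k, (1, 2, 3), 5), count_solutions(p, r, (1, 2, 3), 5))
```
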
From Theorem 6 and Lemma 5 we obtain 


LEMMA 6. If p ≡ −1 (mod 4) and if q is prime to p − 1, there are 
exactly p + 1 solutions of 

x^k + y^k ≡ −1 (mod p), k = 2^m q, m ≥ 1. 

The conditions are satisfied if p ≡ −1 (mod 12), q = 3^n. 

LEMMA 7. If p ≡ 1 (mod 4) and if q is prime to ½(p − 1), there are 
exactly p − 1 solutions of 

x^{2q} + y^{2q} ≡ −1 (mod p). 


LEMMA 8. If p > 2, there are at most km simultaneous ⁷ solutions of 


(170) h^k + H^k ≡ −1, h^m + H^m ≡ −1 (mod p), m ≠ k. 


Let d be the g.c.d. of k and m. Comparing (−H^k)^{m/d} and (−H^m)^{k/d}, 
we get 
(171) (h^k + 1)^{m/d} − (−1)^{(k+m)/d} (h^m + 1)^{k/d} ≡ 0, 
which is not identically ≡ 0. It has at most km/d roots. Since d is a linear 
combination of k, m, we see from (170) that H^d is congruent to a polynomial 
in h. 

LEMMA 9. n(k, p^e) ≤ p^e − 1. 

We exclude polynomials f(x) all of whose values are multiples of p. As 
in § 30, we may assume that the constant term of f(x) is zero. Let v denote 
a value prime to p of f(x). 

I. If A is any integer prime to p, tv ≡ A (mod p^e) has a solution t, 
1 ≤ t < p^e. Thus A is congruent to a sum of t (equal) values of f(x). 


II. If A ≡ 0 (mod p^e), then A ≡ f(0) (mod p^e). 
III. Let A = p^m a, 1 ≤ m < e, a prime to p. By I, 


a = S + zp^{e−m}, S = sum of ≤ p^{e−m} − 1 values of f(x). 

Multiply by p^m. Hence A is congruent modulo p^e to 
p^m S = sum of ≤ p^m(p^{e−m} − 1) < p^e values of f(x). 
36. Case k = 6. We shall prove that n(6, p) ≤ 24 if p > 3. 


I. p ≡ 1 (mod 4). Then h² ≡ −1 (mod p) is solvable. Hence 
h⁶ + 1 ≡ 0. Thus K = 2 in (166) and n(6, p) ≤ 2·12 by Theorem 15 and 
(164) if p > 5. For p = 5, use Lemma 9. 

II. p ≡ −1 (mod 12). By Lemma 6, there are exactly p + 1 solutions of 

(172) h_1⁶ + h_2⁶ + h_3⁶ ≡ 0 (mod p), h_3 = 1. 


If p + 1 > 24, Lemma 8 shows that (172) has a solution for which 
h_1⁴ + h_2⁴ ≢ −1, and one for which h_1² + h_2² ≢ −1 (mod p). But if 
p = 11 or 23, n(6, p) ≤ 22 by Lemma 9. 


⁷ If m/d and k/d are both odd, there are at most 2δ + km − 3dm simultaneous 
solutions, where δ is the number of roots of z^d ≡ −1 (mod p). 

Consider a normal polynomial 


(173) f(x) = x⁶ + c_4x⁴ + c_3x³ + c_2x² + c_1x. 
Then 


(174) P(y) = Σ_{i=1}^{3} f(h_i y) ≡ Σ_j c_j M_j y^j (mod p), M_j = Σ_{i=1}^{3} h_i^j. 


If P(y) is not identically ≡ 0 (mod p), n(6, p) ≤ 3n(4, p) ≤ 18. Next, let 
P(y) be identically ≡ 0 for all solutions of (172). Since (172) holds also 
when h_3 = −1, there are solutions with M_3 ≢ 0, whence c_3 ≡ 0. Similarly, 
M_1 ≢ 0, c_1 ≡ 0. We saw that there are solutions with M_4 ≢ 0, M_2 ≢ 0, 
whence c_4 ≡ 0, c_2 ≡ 0. Hence f(x) = x⁶ and n(6, p) = 2 by Lemma 2 or 
Lemmas 1 and 4. 


LEMMA 10. Except for p = 7, 31, 67, 79, 139, 223, there exist solutions 
of h⁶ + H⁶ ≡ −1 (mod p), if p ≡ 7 (mod 12). 
Since p = 6f + 1, f is odd and the number N of solutions is 36(0, 3) 
by Theorem 5. By § 19, 
N = p + 1 + 16A, N = p + 1 + 10A + 12B, p = A² + 3B², 


according as 2 is or is not a cubic residue of p. The sign of A was there 
chosen so that A ≡ 4 (mod 6). By Theorem 7, B is a multiple 3y of 3 in 
the first case; but in the second case, B is prime to 3 and we may choose the 
sign so that B ≡ A (mod 3). 

Let N = 0. Eliminate p. In the first case, 


7 = {⅓(A + 8)}² + 3y² = 4 + 3, A = −2 or −14, B² = 9, p = 31 or 223. 


In the second case, p and 37 are the products of A + B√−3 and 
5 − 2√−3 by their conjugates, whence, by multiplication, 
37p = X² + 3Y², X = 5A + 6B ≡ 2 (mod 6), Y = −2A + 5B, 
and Y = 3w, X + 37 = 3v, v odd. If N = 0, p = −1 − 2X, 
148 = v² + 3w² = 121 + 3·9 or 1 + 3·49, 
whence p = 7, 139 or 67, 79. 
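
The list of exceptional primes in Lemma 10 may be confirmed by enumeration. The sketch below is our own check and not part of the proof; it reads the congruence of the lemma as h⁶ + H⁶ ≡ −1 (mod p), and the search limit is an arbitrary choice of ours.

```python
def is_prime(n):
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def has_solution(p):
    # Is -1 a sum of two sixth powers modulo p?
    sixth = {pow(h, 6, p) for h in range(p)}
    return any((-1 - a) % p in sixth for a in sixth)

def exceptional_primes(limit):
    # Primes p == 7 (mod 12) below limit for which no solution exists.
    return [p for p in range(7, limit, 12) if is_prime(p) and not has_solution(p)]

if __name__ == "__main__":
    print(exceptional_primes(1000))
```
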




III. p ≡ 7 (mod 12). First, exclude the six p's in Lemma 10. Then 
(172) is solvable. There exists an integer e belonging to the exponent 
6 modulo p. Hence (172) holds also when h_1 is replaced by eh_1. Hence 
(172) has solutions for which M_j ≢ 0, c_j ≡ 0 (j = 1, …, 4 in turn) 
in (174). 

We have the following solutions of (166) with K = 4, k = 6. 


p = 67, 1 + 1 + 1 + 2⁶ ≡ 0; 

p = 79, 1 + 1 + 10 + 67 ≡ 0, 67 ≡ g^{54}, g = 29; 
p = 139, 1 + 1 + 6 + 131 ≡ 0, 6 ≡ g^{30}, 131 ≡ g^{12}, g = 92; 
p = 223, 1 + 4 + 8 + 210 ≡ 0, 210 ≡ 2^{15}. 

For each such p, n(6, p) ≤ 4·6. 
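
The four numerical solutions of (166) can be verified mechanically. The sketch below is ours; it assumes the four rows are to be read as 1 + 1 + 1 + 64 (mod 67), 1 + 1 + 10 + 67 (mod 79), 1 + 1 + 6 + 131 (mod 139) and 1 + 4 + 8 + 210 (mod 223), and it checks that each sum vanishes and that each summand is a sixth power residue.

```python
# Assumed reading of the four solutions of (166) with K = 4, k = 6.
SOLUTIONS = {
    67: (1, 1, 1, 64),
    79: (1, 1, 10, 67),
    139: (1, 1, 6, 131),
    223: (1, 4, 8, 210),
}

def is_sixth_power(a, p):
    # True if a is congruent to h^6 for some h prime to p.
    return any(pow(h, 6, p) == a % p for h in range(1, p))

def solution_is_valid(p):
    terms = SOLUTIONS[p]
    return sum(terms) % p == 0 and all(is_sixth_power(a, p) for a in terms)

if __name__ == "__main__":
    for p in SOLUTIONS:
        print(p, solution_is_valid(p))
```
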


LEMMA 11. If p ≡ 1 (mod k) and if every integer is congruent modulo p 
to a sum of s k-th powers, then n(k, p) ≤ sk. 


There exist k roots h_i of h^k ≡ 1, whence Σ h_i^j ≡ 0 (mod p) for 1 ≤ j < k 
by Newton's identities. We may take f(x) = x^k + ⋯. Then 


Σ_{i=1}^{k} f(h_i y) ≡ ky^k (mod p). 


Since r is now k in Lemma 2, we have 
LEMMA 12. If p ≡ 1 (mod k), n(k, p) ≤ k². 


For k = 6, p = 31, we find that s = 4 in Lemma 11, whence n(6, 31) ≤ 24. 
For p = 7, apply Lemma 9. 
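
Both ingredients of Lemma 11 are simple to compute: the power sums of the k-th roots of unity, and the least s such that every residue is a sum of s k-th powers. The sketch below is an illustration of ours (all function names are hypothetical); for k = 6, p = 31 it returns s = 4, in agreement with the value quoted above.

```python
def roots_of_unity(k, p):
    # The solutions h of h^k == 1 (mod p); there are k of them when p == 1 (mod k).
    return [h for h in range(1, p) if pow(h, k, p) == 1]

def power_sums(k, p):
    # Sums of j-th powers of the k-th roots of unity, for j = 1, ..., k.
    roots = roots_of_unity(k, p)
    return [sum(pow(h, j, p) for h in roots) % p for j in range(1, k + 1)]

def least_number_of_kth_powers(k, p):
    # Smallest s such that every residue is a sum of s k-th powers (0 allowed).
    powers = {pow(x, k, p) for x in range(p)}
    sums, s = {0}, 0
    while len(sums) < p:
        sums = {(a + b) % p for a in sums for b in powers}
        s += 1
    return s

if __name__ == "__main__":
    print(power_sums(6, 31), least_number_of_kth_powers(6, 31))
```
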


THEOREM 16. If p > 3, n(6, p) ≤ 24. 

37. Case k = 8. Proof that n(8, p) ≤ 72. 


I. p ≡ −1 (mod 4). By Lemma 6, there are exactly p + 1 solutions of 
(175) h⁸ + H⁸ ≡ −1 (mod p). 


By Lemma 8, (175) has at most 48 solutions in common with one of 
h⁶ + H⁶ ≡ −1; h⁴ + H⁴ ≡ −1; h² + H² ≡ −1. Hence if p + 1 > 48 
there is a solution of (175) for which M_6 ≢ 0, one for which M_4 ≢ 0, one for 
which M_2 ≢ 0, where the notations refer to (174) with (j = 1, …, 6), 
whence n(8, p) ≤ 3n(6, p) ≤ 72. For p + 1 ≤ 48, we have n(8, p) ≤ 46 
by Lemma 9. 

II. p ≡ 1 (mod 8). Then n(8, p) ≤ 64 by Lemma 12. 


LEMMA 13. Let p = 4f + 1 = x² + 4y², x ≡ 1 (mod 4). The number 
N of solutions of h⁴ + H⁴ ≡ −1 (mod p) is p − 3 − 6x if f is even, but 
p + 1 − 6x if f is odd. 


For, by Theorems 2 and 5, N = 8 + 16(00) or 16(02). Apply (52) 
and (56). 
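
Lemma 13 can be compared with a direct count for small primes. The sketch below is ours and is only a numerical check; it assumes the modulus in the lemma is p, determines x from p = x² + 4y² with the sign fixed by x ≡ 1 (mod 4), and counts the solutions of h⁴ + H⁴ ≡ −1 (mod p) by exhaustion.

```python
import math
from collections import Counter

def representation_x(p):
    # x with p = x^2 + 4y^2 and x == 1 (mod 4); p is a prime == 1 (mod 4).
    y = 1
    while 4 * y * y < p:
        rest = p - 4 * y * y
        x = math.isqrt(rest)
        if x * x == rest:
            return x if x % 4 == 1 else -x
        y += 1
    raise ValueError("no representation found")

def count_solutions(p):
    # Number of pairs (h, H) modulo p with h^4 + H^4 == -1.
    fourth = Counter(pow(h, 4, p) for h in range(p))
    return sum(c * fourth[(-1 - v) % p] for v, c in fourth.items())

def lemma13_value(p):
    x = representation_x(p)
    f = (p - 1) // 4
    return p - 3 - 6 * x if f % 2 == 0 else p + 1 - 6 * x

if __name__ == "__main__":
    for p in (5, 13, 17, 29, 37, 41):
        print(p, count_solutions(p), lemma13_value(p))
```
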

III. p ≡ 5 (mod 8). If in Lemma 13, p + 1 − 6x = 0, eliminate p 
from x² + 4y² = p. Thus 2 = v² + y², v = ½(x − 3). Hence v = ±1, 
x = 1 or 5, p = 5 or 29. 

Let p ≠ 5, p ≠ 29. Then h⁴ + H⁴ ≡ −1 has solutions. By Lemma 5 
it has the same number of solutions as (175). Thus 


(176) h_1⁸ + h_2⁸ + h_3⁸ ≡ 0 (mod p) 


has solutions with h_3 prime to p. There exists an integer e belonging to the 
exponent 4 modulo p. We have (174) with (j = 1, …, 6). Let the new 
(174) be identically ≡ 0 (mod p) for all solutions of (176). As below (174), 
f(x) involves only even powers of x. We do not alter (176) if we replace 
h_3 by eh_3. If the old M_2 is ≡ 0, the new M_2 is ≢ 0, whence c_2 ≡ 0. Similarly 
c_6 ≡ 0. Hence f(x) = x⁸ + c_4x⁴. Employ 


4(a + b)⁴ + 4(a − b)⁴ + (2a)⁴ + (2b)⁴ = 24(a² + b²)². 
The corresponding sum of eighth powers is 8S, where 
S = 33(a⁸ + b⁸) + 28(a⁶b² + a²b⁶) + 70a⁴b⁴. 


We may take a² ≡ 1, b² ≡ −1 (mod p). Then S ≡ 80. Hence there exist 
solutions of 


Σ_{i=1}^{10} h_i⁴ ≡ 0, Σ_{i=1}^{10} h_i⁸ ≡ 8 × 80 ≢ 0 (mod p). 


Then Σ f(h_i y) ≡ 640y⁸. In Lemma 2, r is now 4. Hence every integer is 
congruent to a sum of 4 × 10 values of x⁸ + c_4x⁴. 
For p = 29, n(8, p) ≤ 28 by Lemma 9. 
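
The two polynomial identities employed in III are readily tested on integers. The sketch below is ours; it assumes the identities read 4(a + b)⁴ + 4(a − b)⁴ + (2a)⁴ + (2b)⁴ = 24(a² + b²)² and, for eighth powers, 8S with S = 33(a⁸ + b⁸) + 28(a⁶b² + a²b⁶) + 70a⁴b⁴, and it also confirms S ≡ 80 for one choice with a² ≡ 1, b² ≡ −1.

```python
def weighted_power_sum(a, b, e):
    # 4(a+b)^e + 4(a-b)^e + (2a)^e + (2b)^e: ten e-th powers in all.
    return 4 * (a + b) ** e + 4 * (a - b) ** e + (2 * a) ** e + (2 * b) ** e

def big_s(a, b):
    # The form S of the text (assumed reading).
    return 33 * (a ** 8 + b ** 8) + 28 * (a ** 6 * b ** 2 + a ** 2 * b ** 6) + 70 * a ** 4 * b ** 4

if __name__ == "__main__":
    # p = 13 is a prime == 5 (mod 8); a = 1, b = 5 give a^2 == 1, b^2 == -1 (mod 13).
    print(big_s(1, 5) % 13, 80 % 13)
```
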


THEOREM 17. n(8, p) ≤ 72. 

38. Case k = 2q, q a prime > 3. 


I. p ≡ 1 (mod q). Then n(k, p) ≤ k² by Lemma 12. 
II. p ≢ 1 (mod q). By Lemmas 6, 7, there are exactly p ± 1 solutions of 
(177) h^k + H^k ≡ −1 (mod p), 


according as p ≡ ∓1 (mod 4). Apply Lemma 8 with m < k. Thus 
n(k, p) ≤ 3n(k − 2, p) if p + 1 > k(k − 2). 


THEOREM 18. If p ≥ 7, n(10, p) ≤ 216. 
This was proved if p + 1 > 80. But if p ≤ 79, apply Lemma 9. 
39. Case k = 12. To prove that n(12, p) ≤ 648. 
I. p ≡ 1 (mod 12). By Lemma 12, n(12, p) ≤ 144. 
II. p ≡ −1 (mod 12). By Lemma 6, 
(178) h^{12} + H^{12} ≡ −1 (mod p) 


has exactly p + 1 solutions. If p + 1 > 120, Lemma 8 shows that 
n(12, p) ≤ 3n(10, p) ≤ 648. If p + 1 ≤ 120, apply Lemma 9. 


III. p ≡ 5 (mod 12). Since every integer is congruent to a cube 
(Lemma 4), the number N of solutions of (178) is the same as the number 
of solutions of 
(179) z⁴ + w⁴ ≡ −1 (mod p), 
which is true also by Lemma 5. Here p ≡ 5 or 17 (mod 24). 


IV. p ≡ 5 (mod 24). By Lemma 13, N = p + 1 − 6x, x ≡ 1 (mod 4), 
x² + 4y² = p. Thus N ≡ 0 (mod 24). Since −2 is a quadratic non-residue 
of every prime p ≡ 5 (mod 8), z⁴ ≢ w⁴ in (179). Also, w⁴ ≡ −1 implies 
w⁸ ≡ 1, w⁴ ≡ 1, whence z ≢ 0, w ≢ 0 in (179). But z⁴ ≡ 1 has 
four roots. Hence N is a multiple of 2 × 4 × 4. Hence N is divisible by 96. 

If N > 12 × 10, Lemma 8 gives n(12) ≤ 3n(10). It remains to treat 
N = 0, N = 96. By III, § 37, N = 0 requires that p = 5 or 29. If N = 96, 
eliminate p = 6x + 95 from x² + 4y² = p. Hence 


26 = v² + y², v = odd, v = ±5, ±1, 
p = 53, 101, 173 (125). Apply Lemma 9. 


V. p ≡ 17 (mod 24). By Lemma 13, N = p − 3 − 6x. Hence 
N ≡ 8 (mod 24). Since there exists a number e belonging to the exponent 8, 
z⁴ ≡ −1 (mod p) has four roots. Thus (179) has eight solutions with 
z ≡ 0 or w ≡ 0, and hence has N − 8 solutions z, w both prime to p. If the 
quadratic residue 2 is a residue of a fourth power, there are four solutions 
of 2z⁴ ≡ −1 and hence 16 solutions of (179) with z⁴ ≡ w⁴, whence N − 24 
is a multiple of 2 × 4 × 4. This with N ≡ 8 (mod 24) gives N ≡ 56 (mod 96). 
Next let z⁴ ≢ w⁴; then N − 8 is a multiple of 32. Hence N ≡ 8 (mod 96). 

By Lemma 8, it remains to treat N ≤ 120, whence N = 8, 56 or 104. Eliminate 
p and write v = ½(x − 3) = odd. 


If N = 8, 5 = v² + y², v = ±1, p = 17 or 41. 
If N = 56, 17 = v² + y², v = ±1, p = 89, or 53 ≢ 17 (mod 24). 
If N = 104, 29 = v² + y², v = ±5, p = 65 or 185 (not primes). 


For these primes apply Lemma 9. 


VI. p ≡ 7 (mod 12). By Lemma 5, the number N of solutions of 
(178) is the same as that of x⁶ + y⁶ ≡ −1. If N = 0, Lemma 10 gives 
p ≤ 223, whence Lemma 9 applies. Let N > 0. Since there exists an integer 
belonging to the exponent 6 modulo p, the usual proof gives n(12) ≤ 3n(10) 
unless f(x) = x^{12} + cx⁶. 

Suppose that (178) has a solution in common with 


(180) h⁶ + H⁶ ≡ −1 (mod p). 


Elimination of h⁶ gives H^{12} + H⁶ + 1 ≡ 0, whence H^{18} ≡ 1. Thus the g.c.d. 
of the exponents in H^{18} ≡ 1, H^{p−1} ≡ 1 must exceed 6. Hence p ≡ 19 (mod 36). 
Conversely, H^{12} + H⁶ + 1 ≡ 0 then has twelve roots. Write h = tH², t⁶ ≡ 1. 
Then (180) holds and there are exactly 72 simultaneous solutions of (178) 
and (180). 

It therefore remains only to treat the case N ≤ 72. By the results below 
Lemma 10, N = p + 1 + 16A or p + 1 + 2X. Eliminate p. In the first 
case, 15 = z² + 3y², z = ⅓(−A − 8), which is impossible. In the second 
case, 37·12 = v² + 3w². Thus v = 3u, 148 = 3u² + w². The only solutions 
are (u², w²) = (9, 121), (16, 100), (49, 1). The p's are 19, 199, 271, 
91 and 217 (factors 7), 73 ≢ 19 (mod 36). Apply Lemma 9. 


THEOREM 19. If p=", n(12, p) S 648. 


40. Case k = 14. If p + 1 > 14 × 12 = 168, § 38 applies. If 
p + 1 ≤ 168, Lemma 9 applies. Hence 


THEOREM 20. If p=, n(14, p) = 1944. 
41. Case k = 16. To prove n(16, p) ≤ 5832. 
I. p ≡ 1 (mod 16). By Lemma 12, n(16, p) ≤ 256. 


II. p ≡ −1 (mod 4). By Lemma 6, with q = 1, there exist exactly 
p + 1 solutions of 


(181) h^{16} + H^{16} ≡ −1 (mod p). 


By Lemma 8, n(16, p) ≤ 3n(14, p) unless p + 1 ≤ 224, and then Lemma 9 
applies. 

III. p ≡ 5 (mod 8). By Lemma 5, (181) has the same number N of 
solutions as x⁴ + y⁴ ≡ −1 (mod p). By Lemma 13, 


N = p + 1 − 6x, p = x² + 4y², x ≡ 1 (mod 4). 


Since −2 is a quadratic non-residue of p, h^{16} ≢ H^{16} in (181). Also 
h ≢ 0, H ≢ 0. If j^{16} ≡ 1, then j⁴ ≡ 1. Hence the solutions fall into sets 
of 2 × 4 × 4, so that N is a multiple of 32. There remains the case N ≤ 224. 
If N ≡ 1 (mod 3), p would be divisible by 3. 

If N = 0, p = 5 or 29 by III of § 37. If N = 96, p = 53, 101 or 173 by 
IV of § 39. Write v = ½(x − 3) = odd. If N = 32, 10 = v² + y², p = 13, 
37, 61. If N = 128, 34 = v² + y², p = 109 or 181. If N = 192, 50 = v² + y², 
v² = 1, 25, 49, p = 149, 197, 269, 293. If N = 224, 58 = v² + y², p = 157 
or 277. For these p's apply Lemma 9. 


IV. p ≡ 9 (mod 16). By Lemma 5, (181) has the same number M of 
solutions as 
(182) h⁸ + H⁸ ≡ −1 (mod p). 


By Theorem 5, M = 64(04). By (114), (115), 


(183) M = p + 1 − 18x or M = p + 1 + 6x + 24a, p = x² + 4y², 
p = a² + 2b². 
First, let M > 0. Since there exists an integer belonging to the exponent 
8, n(16) ≤ 3n(14) unless f(x) = x^{16} + cx⁸. As in VI, § 39, if there 
be a simultaneous solution of (181) and (182), then 


H^{16} + H⁸ + 1 ≡ 0, H^{24} ≡ 1, H^{p−1} ≡ 1 (mod p), p ≡ 25 (mod 48). 


Conversely, there are then exactly 16 × 8 = 128 common solutions. Hence 
there remains only the case M ≤ 128. Then in (183₁), p = 18x + 127, 
whence 52 is the sum of the squares of ½(x − 9) and y, viz., 36 and 16, or 
vice versa. Thus p = 73 or 433 ≡ 1 (mod 16). Next, if M ≤ 128 in (183₂), 
then 


p + 1 − 30√p < 128, p < 1156. 


But for p = 73 or p < 1156, Lemma 9 applies. 
Second, let M = 0. Evidently p < 1156. 


THEOREM 21. If p>, n(16, p) S 5832. 
42. Case k = 18. Let p ≡ −1 (mod 3). By Lemma 6, with q = 9, 


(184) h^{18} + H^{18} ≡ −1 (mod p) 


has exactly p + 1 solutions. By Lemma 8, if p + 1 > 288, n(18, p) ≤ 3n(16, p). 
Apply Lemma 9. 

For p > 2, there remains the case p ≡ 1 (mod 6). If p ≡ 1 (mod 18), 
apply Lemma 12. Henceforth, let p ≡ 7 or 13 (mod 18). By Lemma 5, 
(184) has the same number N of solutions as x⁶ + y⁶ ≡ −1 (mod p). By 
Lemma 8, there remains the case N ≤ 18·16 = 288. 


In the respective cases below Lemma 10, 


p + 1 ≤ −16A + 288 < 16√p + 288, p < 729, 
p + 1 ≤ 288 − 2X ≤ 288 + 2√(37p), p ≤ 580. 


Apply Lemma 9. Hence we have 
THEOREM 22. If p=, n(18, p) = 17496. 


43. Case k = 20. When p ≢ 1 (mod 5), p ≡ −1 or +1 (mod 4), 
we find by Lemma 6 or Lemma 13 the number N of solutions of h^{20} + H^{20} 
≡ −1 (mod p), and proceed as usual. If p ≡ 1 (mod 20), apply Lemma 12. 
There remains only the case p ≡ 11 (mod 20); since N is not known, we 
resort to the rough Theorem 15 and Theorem 14 and obtain 


THEOREM 23. n(20) ≤ 11n(19) ≤ 22n(18). 


44. For k = 22, we employ § 38 with q = 11. It remains to treat 
p + 1 ≤ 440; apply Lemma 9. 
THEOREM 24. n(22, p) ≤ 3n(20, p). 


45. It remains to treat primes p which divide k. Let p^e be the highest 
power of p which divides k. Write 


P = p^{e+1} if p > 2, P = 2^{e+2} if p = 2. 


We seek N such that every integer is congruent modulo P to a sum of N 
values of any polynomial in x not all of whose values are multiples of p. 
By Lemma 9, N ≤ P − 1. Hence if k = 3, N ≤ 8; if k = 4, N ≤ 15; 
if k = 5, N ≤ 24; if k = 6, N ≤ 8. For these, N < s in (160). For 
7 ≤ k ≤ 26, N < n = n(k), for the n listed in § 29. 
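
The bounds just quoted follow from a one-line computation. The sketch below is ours and rests on an assumption about the damaged display, namely that P = p^{e+1} for odd p and P = 2^{e+2} for p = 2, where p^e is the highest power of p dividing k; this reading reproduces the values 8, 24 and 8 for k = 3, 5, 6.

```python
def modulus_P(k, p):
    # Exponent e of the highest power of p dividing k.
    e = 0
    while k % p ** (e + 1) == 0:
        e += 1
    # Assumed reading of the display: p^(e+1) for odd p, 2^(e+2) for p = 2.
    return p ** (e + 1) if p > 2 else 2 ** (e + 2)

def lemma9_bound(k, p):
    # Lemma 9 with modulus P gives N <= P - 1.
    return modulus_P(k, p) - 1

if __name__ == "__main__":
    for k, p in ((3, 3), (4, 2), (5, 5), (6, 3)):
        print(k, p, lemma9_bound(k, p))
```
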


THE UNIVERSITY OF CHICAGO. 


THE EQUIVALENCE OF NON-SINGULAR PENCILS OF 
HERMITIAN MATRICES IN AN ARBITRARY FIELD. 


By J. WILLIAMSON. 


The problem of the equivalence of two non-singular pencils of real symmetric 
matrices in the real field was first solved by Muth.¹ More recently 
Trott,² Wegner,³ Ingraham ⁴ and Turnbull ⁵ have solved the similar problem 
for two Hermitian matrices under conjunctive transformations in the complex 
field. The notation used by Trott was such that he was able to discuss the 
Hermitian case and at the same time the real symmetric case. In this paper 
we show how Trott's method may be extended to the similar problem of the 
equivalence of two non-singular pencils of Hermitian (or symmetric) matrices 
with respect to a general commutative field K. Incidentally, as is often the 
case with a generalization, we show why the results in the case of the complex 
field (or real field) are comparatively simple. We prove that a necessary and 
sufficient condition for two such pencils to be equivalent is that 


(α) they have the same elementary factors with respect to K, 
and (β) certain diagonal matrices be equivalent in over fields of K. 


In the simple cases already considered conditions (β) can all be expressed 
in terms of the equality of certain integers, the signatures of the respective 
quadratic or hermitian forms. That no such great simplification is possible 
in the general case is apparent from a consideration of two pairs of one-rowed 
matrices a, b, and c, d in the rational field, where a, b, c, d, are all rational 
numbers and b and d are both different from zero. The pair a, b, is equivalent 
to the pair c, d, if, and only if, a − λb and c − λd have the same elementary 


¹ P. Muth, "Über reelle Äquivalenz von Scharen reeller quadratischer Formen," 
Crelle's Journal, vol. 128 (1905), pp. 302-343. 

² G. R. Trott, "On the canonical form of a non-singular pencil of Hermitian 
matrices," American Journal of Mathematics, vol. 56, no. 3 (1934), pp. 359-371. We 
shall refer to this paper as Trott, 1. 

³ K. W. Wegner, "Equivalence of pairs of Hermitian matrices," Bulletin of the 
American Mathematical Society, vol. 40, no. 1, January (1934), Abstract 103. 

⁴ M. H. Ingraham, "The singular case of the equivalence of pairs of Hermitian 
matrices," Bulletin of the American Mathematical Society, vol. 40, no. 7, July (1934), 
Abstract 242. 

⁵ H. W. Turnbull, "Pencils of Hermitian forms," Proceedings of the London 
Mathematical Society, series 2, vol. 39 (1935), pp. 232-248. 



divisors, i. e., if a/b = c/d (condition α), and, if b = k²d, where k is a rational 
number (condition β). 
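
The one-rowed example can be made concrete in exact rational arithmetic. The sketch below is an illustration of ours (the function names are hypothetical); it tests the two conditions in the form a/b = c/d and b/d equal to the square of a rational number.

```python
from fractions import Fraction
from math import isqrt

def is_rational_square(q):
    # A non-negative fraction in lowest terms is a square exactly when
    # its numerator and denominator are both perfect squares.
    q = Fraction(q)
    if q < 0:
        return False
    n, d = q.numerator, q.denominator
    return isqrt(n) ** 2 == n and isqrt(d) ** 2 == d

def pairs_equivalent(a, b, c, d):
    # Conditions for one-rowed pairs (a, b) and (c, d) with b, d non-zero:
    # a/b = c/d, and b = k^2 d for some rational k.
    a, b, c, d = (Fraction(v) for v in (a, b, c, d))
    return a / b == c / d and is_rational_square(b / d)

if __name__ == "__main__":
    print(pairs_equivalent(1, 2, 4, 8))   # b/d = 1/4 is a rational square
    print(pairs_equivalent(1, 2, 2, 4))   # b/d = 1/2 is not
```

Thus the pairs 1, 2 and 2, 4 have the same elementary divisor yet are not equivalent in the rational field, which is the phenomenon no signature can detect.
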

Section 1 is devoted to preliminary definitions and proofs; the main 
results are proved in § 2 and a short discussion of these results is given in 
§ 3. No attempt is made to consider a similar problem for singular pencils. 

1. Let K be any commutative field of characteristic zero ⁶ and let K(i) 
be a quadratic field over K, where i is a root of the equation x² − a = 0, 
irreducible in K. Then every element α of K(i) is of the form α = a_1 + ia_2, 
where a_1 and a_2 lie in K, so that the conjugate of α is the element ᾱ = a_1 − ia_2. 
If R is a matrix over K(i), R = R_1 + iR_2, where R_1 and R_2 are both matrices 
over K, and R̄ = R_1 − iR_2. The matrix R* is defined to be the conjugate 
transposed of R so that 

R* = R̄′ = R_1′ − iR_2′. 

When R is a square matrix of order n, we may consider R as a matrix 
of matrices and write 


(1) R = (R_{ij}), (i, j = 1, 2, …, t), 


where R_{ij} is a matrix of r_i rows and r_j columns and Σ_{i=1}^{t} r_i = n. 
If S is a second n-rowed square matrix and S is written as a matrix of matrices, 


(2) S = (S_{ij}), (i, j = 1, 2, …, t), 


where S_{ij} is also a matrix of r_i rows and r_j columns, we say that S and R are 
similarly partitioned or that (2) is a partition of S similar to (1). If in (1), 
when i is different from j, R_{ij} is the zero matrix, we call R a diagonal block 
matrix and write 

R = [R_{11}, R_{22}, …, R_{tt}]. 
If D is a square matrix of order n, whose elements lie in K, the invariant 
factors E_i(λ) of D − λE are polynomials over K. We call ⁷ the powers of 
the distinct irreducible factors of E_i(λ) the elementary factors (with respect 
to K) of D − λE. Let the elementary factors of D − λE be 


(3) [p_i(λ)]^{r_{ij}}, (i = 1, 2, …, t; j = 1, 2, …, k_i), 


⁶ It is not essential for this discussion that the characteristic p of K be zero. On 
the other hand p cannot be arbitrary. We, however, restrict ourselves to the case 
p = 0 for the sake of simplicity. 

⁷ Cf. Neal McCoy, "On the rational canonical form of a function of a matrix," 
American Journal of Mathematics, this volume, p. 492; J. H. M. Wedderburn, Lectures 
on Matrices, pp. 123-126. 


where p_i(λ) is a polynomial over K of degree n_i, irreducible in K, with leading 
coefficient unity and such that p_i(λ) ≠ p_j(λ), if i ≠ j. Then 

n = Σ_{i=1}^{t} Σ_{j=1}^{k_i} n_i r_{ij}. 

Further let p_i be a square matrix of order n_i, with elements in K, whose 
characteristic equation is p_i(λ) = 0, and let N_{ij} be the matrix 


a 
0 


0 j=1,2,---, ks), 
0 
where e_i is the unit matrix of order n_i and N_{ij}, considered as a matrix of 
matrices, is of order r_{ij}. If M_i is the diagonal block matrix 


(5) M_i = [N_{i1}, N_{i2}, …, N_{ik_i}], 
and 
(6) M = [M_1, M_2, …, M_t], 


the elementary factors of D − λE are the same as those of M − λE. Hence 
M is similar to D in K and is a canonical form of D in K. 

We now define two matrices of order r_{ij}, whose elements are matrices of 
order n_i. These two matrices are the auxiliary unit matrix 


0’) 
0 
(1) 


0) 


and the counter unit matrix 


(8) 


The matrix N_{ij}, defined by (4), may accordingly be written in the convenient 
form 
N_{ij} = p_iE_{ij} + U_{ij}, 


where E_{ij} = T²_{ij}. Moreover a simple calculation shows that 


(9) 45 = 


| 
(t—1,2,---,¢; 
| 

t 

0 i 0 
T= : 
C4 0 


Let q_i be a non-singular matrix over K of order n_i, satisfying the equation 


(10) q_ip_i = p_i′q_i. 

It has been shown that such a matrix q_i exists and that it is necessarily 
symmetric.⁸ Further, if 


then 
Qil Nig = is (piles + Vis), 
= + by (9) and (10), 
so that 
(12) Nis = 
Accordingly the matrix 


(13) Q = [Q_1, Q_2, …, Q_t], where Q_i = [Q_{i1}, Q_{i2}, …, Q_{ik_i}], 
is a non-singular symmetric matrix over K, satisfying the equation 
(14) QM = M′Q. 

Moreover, if R is a matrix over K(i), such that 


(15) RM = M′R, 
then 
(16) R = QS, 


where S is a matrix commutative with M. The form of S is known.⁹ In fact 
S is a diagonal block matrix 

(17) S = [S_1, S_2, …, S_t], 

partitioned similarly to M in (6). Further, if for simplicity we write 
n, r_j, T_j, U_j, q, p, and k for n_i, r_{ij}, T_{ij}, U_{ij}, q_i, p_i and k_i respectively and let 


(18) (Sys), (r,s =1, -,k), 


be a partition of S; similar to that of M; in (5), Sr. is a matrix of y, rows 
and 7 columns, where y, = 7s, if rs. Moreover, if rSs, 


(19) Ser (0'Gsr), 


⁸ R. C. Trott, Bulletin of the American Mathematical Society, vol. 41, no. 1, part 2, 
January (1935), Abstract No. 95. We shall refer to this paper as Trott 2. 
⁹ Trott 2. 


| 




where G,s and Gs, are both square matrices of order ys =, while 0 denotes 
the zero matrix of orders »r—», 7 and 0’ its transposed. More exactly, 


n-1 
(20) Gee = ps 


a=0 


where rsa aNd Ysera are polynomials in the matrix p with coefficients in K(i). 
We now define the two matrices 


(21) Bis (0, Gre), Ser = 8, 


so that in particular, if yr = na, 
It should be noted that S,. is formally the transposed conjugate of S;s, if p is 
considered as an indeterminate instead of a matrix. 
It follows from (20) that 


n-1 
G*,sqT's == g* reaU 
a=0 
1 


GreaU by (9) and (10), 


Hence, if rs, 
= (G*,s 0)qT,, (0 the zero matrix of orders nr ys, ns) 
= (0 = (0 qT sGre) = Gre) = qT by (21). 


Similarly, 


0 0 
8 ( 0 ) q 
Therefore for all values of r and s 
(23) 8*,.qT, = qT Ses. 


If the matrix FR, defined by (16), is such that R = R*, on equating 
corresponding elements of the two matrices we have 


ql Srs = (qT'sSer)*, 
or 
(24) QT = 8* = QT r(Ser) by (23), 
80 that 
(25) Sre = Ser. 


| 
7-1 
SC“. 
a=0 
a=0 
sGrs. 
t 
2 




In particular, if 7, = ys, it follows from (22) that 


Srs Ser, 


k) 


(26) 
and, if r = s, that 

(27) Srp = Serr, (r= 1,2,---,k), 
so that S,, lies in K. 


2. We now consider two square matrices, $A$ and $B$, of order $n$, with
elements in the field $K(i)$, of which the second, $B$, is non-singular. The
matrices $A$ and $B$ are such that $A = A^*$ and $B = B^*$, so that both matrices
are hermitian matrices or else, when $A$ and $B$ are both matrices over $K$,
symmetric matrices over $K$. Moreover, if $A$ and $B$ are both matrices over $K$,
in the sequel every matrix $P$ is a matrix over $K$ and $P^*$ is to be interpreted
as $P'$. Since $A = A^*$ and $B = B^*$, the invariant factors $E_j(\lambda)$ of the pencil
$A - \lambda B$, which are certainly polynomials over the field $K(i)$, are unaltered
by the substitution of $-i$ for $i$ and are accordingly polynomials over $K$. We
are therefore at liberty to talk of the elementary factors (with respect to $K$)
of the pencil $A - \lambda B$. We let these elementary factors be the polynomials (3),
so that $A - \lambda B$ has the same elementary factors as $M - \lambda E$. Since the
elementary factors of $A - \lambda B$ are the same as those of $AB^{-1} - \lambda E$, the two
matrices $AB^{-1}$ and $M$ are similar. Hence there exists a non-singular matrix $P$,
such that

$P^{-1}(AB^{-1} - \lambda E)P = M - \lambda E,$
or
(28) $(A - \lambda B)B^{-1}P = P(M - \lambda E).$


In general the elements of the matrix $P$ lie in $K(i)$, but, if $A$ and $B$ are both
symmetric matrices over $K$, $P$ is also a matrix over $K$. As a consequence of
(28) we have


(29) $P^*B^{-1}(A - \lambda B)B^{-1}P = R(M - \lambda E),$
where
(30) $R = P^*B^{-1}P.$


It follows from (30), since $B^* = B$, that $R = R^*$ and from (29) that
$RM = P^*B^{-1}AB^{-1}P$, so that $RM$ is hermitian, and accordingly that


(31) $RM = M^*R^* = M'R.$
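
Relations (28) to (31) can be checked on a small example. The sketch below treats the symmetric case over the rationals, so that $P^* = P'$. The matrices `A`, `B`, `M` and `P` are illustrative choices made for this example and do not come from the paper, and the convention used for the companion matrix `M` is an assumption.

```python
from fractions import Fraction as Fr

def mat(rows):
    """Exact rational matrix from integer rows."""
    return [[Fr(x) for x in row] for row in rows]

def mul(X, Y):
    """Matrix product X Y."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

# Hypothetical symmetric pair over Q; B is non-singular (det B = 1).
A = mat([[1, 0], [0, -1]])
B = mat([[2, 1], [1, 1]])
B_inv = mat([[1, -1], [-1, 2]])

# A B^{-1} has characteristic polynomial x^2 + x - 1, irreducible over Q.
# M is a companion matrix of it (assumed convention); P was chosen by hand
# so that A B^{-1} P = P M, i.e. P^{-1} A B^{-1} P = M.
M = mat([[0, 1], [1, -1]])
P = mat([[1, 1], [0, 1]])

W = mul(B_inv, P)                      # transforming matrix B^{-1} P
R = mul(transpose(P), W)               # (30): R = P' B^{-1} P
RM = mul(R, M)

lhs_A = mul(mul(transpose(W), A), W)   # P' B^{-1} A B^{-1} P
lhs_B = mul(mul(transpose(W), B), W)   # P' B^{-1} B B^{-1} P

assert mul(mul(A, B_inv), P) == mul(P, M)   # similarity behind (28)
assert lhs_A == RM and lhs_B == R           # coefficients of (29) in lambda
assert R == transpose(R)                    # R = R*
assert RM == mul(transpose(M), R)           # (31)
```

Comparing the two coefficients of $\lambda$ separately is equivalent to (29), since both sides are linear in $\lambda$.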


We shall now reduce the pencil of matrices $R(M - \lambda E)$ by a conjunctive
transformation¹⁰ to a canonical form $G(M - \lambda E)$; that is, determine a non-
singular matrix $W$ such that


(32) $W^*R(M - \lambda E)W = G(M - \lambda E).$


Accordingly, as a consequence of (29) and (32), the pencil $A - \lambda B$ is
equivalent under a conjunctive transformation to the pencil $G(M - \lambda E)$.
Moreover, it follows immediately from (32) that $G = W^*RW$ and that
$W^*RMW = GM = W^*RWM$. Hence $G = G^*$ and


(33) $MW = WM,$


so that throughout the various stages of the reduction the transforming 
matrices are all permutable with M. 

As a consequence of (31) $R = QS$, where $Q$ is defined by (13) and $S$
by (17), (18) and (19). Therefore,

$R = [R_1, R_2, \ldots, R_t]$

is a diagonal block matrix, where
$R_i = Q_iS_i,$


and, since $MW = WM$, the matrix $W$ is also a diagonal block matrix
$[W_1, W_2, \ldots, W_t]$, where $W_i$ is of the same order as $M_i$. Hence we see that
in reducing $R$ we may reduce each $R_i$ separately by transformations $W_i$,
permutable with $M_i$. As this is the case we temporarily drop all suffixes $i$ and
write $M$, $R$, $S$, $Q$, $T_j$ etc. for $M_i$, $R_i$, $S_i$, $Q_i$, $T_{ij}$ respectively.

We first show that without any loss of generality we may assume $S_{11}$
to be non-singular. Since $\eta_1 \ge \eta_2 \ge \cdots \ge \eta_k \ge 1$, we may suppose that
$\eta_1 = \eta_2 = \cdots = \eta_s > \eta_{s+1}$, $1 \le s \le k$. If $S_{11}$ is singular but, for some value
of $j \le s$, $S_{jj}$ is non-singular, by interchanging the first row and the $j$-th row
of $S$ and the first column and the $j$-th column, we move $S_{jj}$ into the place
of $S_{11}$. Moreover such an interchange may be accomplished by means of a
conjunctive transformation permutable with $M$ and $Q$.¹¹ We now assume that
$S_{jj}$ is singular for all values of $j$, $1 \le j \le s$. If $s_{ij}$ denotes the first element,
i.e., the element in the first row and first column, of the matrix $S_{ij}$,
$|S_{jj}| = |s_{jj}|^{\eta_j}$ and, since $s_{jj}$ is a polynomial in the matrix $p$ with coefficients

¹⁰ We shall use the term conjunctive transformation to include the case of a
congruent transformation; i.e., a transformation of matrix $W$, where $W$ lies in $K$,
so that $W^* = W'$.

¹¹ Turnbull and Aitken, An Introduction to the Theory of Canonical Matrices, p. 11.


in $K$ (equation (27)), $s_{jj} = 0$. In particular $s_{11} = 0$, and, since, from the
nature of $S_{j1}$ (equation (19)), $s_{j1} = 0$ when $j > s$, there is at least one value $j$,
$1 < j \le s$, such that $s_{j1} \ne 0$, as otherwise $S$ would be singular. After a
suitable interchange of rows and columns we may therefore suppose that $s_{21}$
is not zero. Let


1 41 1 4] 


where and £, are the unit matrices of orders and (3 + 4s +m) 
respectively. The two matrices W, and W, are both permutable with M as 
are the matrices W*, and W*, with Q. A simple calculation shows that, if 


$W^*_1QSW_1 = QX$ and $W^*_2QSW_2 = QY$
and $X$ and $Y$ are partitioned similarly to $M$,
$X_{11} = S_{11} + S_{21} + S_{12} + S_{22}, \quad Y_{11} = S_{11} + i(S_{12} - S_{21}) - i^2S_{22}.$


The first elements of these matrices are respectively $x_{11} = s_{12} + s_{21}$ and
$y_{11} = i(s_{12} - s_{21})$, since $s_{11} = s_{22} = 0$. As $s_{21} \ne 0$, at least one of $x_{11}$ or $y_{11}$
is different from zero, so that at least one of $X_{11}$ or $Y_{11}$ is non-singular.
However, as $W_2$ is not a matrix over $K$, we must still show that, if $S$ is a
matrix over $K$, the matrix $X_{11}$ is non-singular. This is in fact the case; for,
by (26), $\bar s_{21} = s_{12}$, so that, if $S$ lies in $K$, $s_{21} = \bar s_{21} = s_{12}$ and
$x_{11} = 2s_{21} \ne 0$. Hence we may assume without any loss of generality that
$S_{11}$ is non-singular.

We next show that $S$ may be reduced to a diagonal block matrix
partitioned similarly to $M$. Let


0 E. 0 0 
W = 0 0 E; 0 , 
0 0 0 J 


so that W is certainly permutable with M. The element in the r-th place, 
r > 1, of the first column of W*Q is 


— = — = — by (24). 


Hence W*Q = QH, where H is obtained from W* by replacing the element 
in the r-th place, r > 1, of the first column by — S,,8;,-7.. A simple calcula- 
tion now shows that if, 


$W^*QSW = QD = Q(D_{rs}),$
$D_{11} = S_{11}, \quad D_{r1} = D_{1r} = 0, \quad (r \ne 1),$
$D_{rs} = S_{rs} - S_{r1}S_{11}^{-1}S_{1s}, \quad (r, s = 2, 3, \ldots, k).$


We have therefore shown that there exists a non-singular conjunctive
transformation permutable with $M$, which reduces $R = QS$ to the form
$Q[S_{11}, H]$, where $H$ is a square matrix of order $n(\eta_2 + \eta_3 + \cdots + \eta_k)$ and is
commutative with $[N_2, N_3, \ldots, N_k]$. Accordingly $H$ is of exactly the same
type as $S$ and we may repeat our previous argument with $S$ replaced by $H$.
Hence in $k - 1$ steps we deduce the existence of a non-singular matrix $V$,
permutable with $M$, such that


(34) $V^*QSV = Q[G_1, G_2, \ldots, G_k],$


where $G_j$ is of the same order as $N_j$, $(j = 1, 2, \ldots, k)$. Moreover $G_j$ is
permutable with $N_j$ and $Q_jG_j$ is hermitian. Hence by (27) $G_j$ is a matrix
over $K$ and $Q_jG_j$ is symmetric.

We now show that it is possible to reduce $G_j$, $(j = 1, 2, \ldots, k)$, to a
diagonal block matrix. For simplicity we write $\eta = \eta_j$, so that


$G_j = \sum_{a=0}^{\eta-1} g_aU_j^{a}$ (formulas (19) and (20)),


where $g_a$ is a polynomial in $p$ with coefficients in $K$. Since $G_j$ is non-
singular, $g_0$ is non-singular and is accordingly different from zero. If
$g_1 = g_2 = \cdots = g_{\eta-1} = 0$, $G_j = g_0E_j$ and $G_j$ is a diagonal block matrix. If
$g_1 = g_2 = \cdots = g_c = 0$, $c < \eta - 1$ and $g_{c+1} \ne 0$, we consider the matrix


$W = E_j - wU_j^{c+1}$, where $w = g_{c+1}/2g_0$,
so that
G;W? = G; — 2wU + w?U 
a=c+2 
Where hess = — 2gw and ha is a polynomial in p, when a=c + 2. 
But 
W*Q;G;W = W’qT;G,;W = qT;WG;W by (9) and (10), 
= = Hj. 


Hence the matrix $W$ reduces $Q_jG_j$ to $Q_jH_j$, where $H_j$ is of the same form as
$G_j$ except that the coefficient of $U_j^{c+1}$ is now zero. If the coefficient of $U_j^{c+2}$
in $H_j$ is different from zero, we may repeat our argument with $G_j$ replaced
by $H_j$. Accordingly in at most $\eta - 1$ such steps we can reduce $G_j$ to the
diagonal form $g_0E_j$. Let us therefore suppose that $G_j$ is already in diagonal
form, so that, after an obvious change of notation,


(35) $G_j = g_jE_j,$


and $g_j = g_j(p)$ is a polynomial in $p$ with coefficients in $K$. Equation (34)
therefore becomes,


(36) $V^*QSV = QG = Q[g_1E_1, g_2E_2, \ldots, g_kE_k],$


and we call the matrix on the right of this last equation a canonical form for
the matrix $QS$.

It is apparent that this canonical form may not be unique. Suppose
therefore that there exists a non-singular matrix $Y$, permutable with $M$,
such that
(37) $Y^*QSY = QF = Q[f_1E_1, f_2E_2, \ldots, f_kE_k],$


where $f_j = f_j(p)$ is a polynomial in $p$ with coefficients in $K$. Then, if
$W = V^{-1}Y$, $W$ is permutable with $M$ and, as a consequence of (36) and (37),


(38) $W^*QGW = QF.$


If $W = (W_{rs})$, $(r, s = 1, 2, \ldots, k)$, is a partition of $W$ similar to that of
$M$, $W_{rs}$ and $W_{sr}$ are of the same forms as $S_{rs}$ and $S_{sr}$ in (19). We define
W,. in an analogous manner to S,. in (21), so that in particular, if 4, =, 
Wrs = Wrs. The matrix equation (38) may now be written in the form 


k 
W*arglagaWas = SreqQT (1,8 =1,2,- the Kronecker 8). 
a=1 


Hence by (23) 


a=1 


or, on dividing by the non-singular matrix qT’, 


(39) WargaWas = 
a=1 


If w,4; is the first element of the matrix W,; and #;; the first element of 
the matrix W ij, it follows from the nature of the matrices W;,; and Wij (cf. 
equations (19) and (21)), that the first element of the matrix WargaWas is 
WarJaWas. Accordingly by equating the first elements of each component 
matrix in (39), we have 


(7 = 1, 




(40) WarJaWas 
1 
But by (19) and (21), 
Was = 0, if Na < Ns; War = 0, if Na > nr; War = War, if Na >= 7r- 


Hence, if > yo > and CS8Sd, cSrZd, 
WarJaWas= 0, when a<cora>d. Accordingly (40) becomes 


d 
(41) WarJaWas 
Cc 


a= 

Let $D$ be the matrix $(d_{ij})$, $(i, j = 1, 2, \ldots, d + 1 - c)$, where
$d_{ij} = w_{c+i-1,\,c+j-1}$. Then it is a consequence of the form of $W$ that, since $W$
is non-singular, $D$ is non-singular, for, after a proper interchange of rows
and columns, it can be shown that $|D|$ is a factor of $|W|$. Each element
$d_{ij}$ of $D$ is a polynomial $d_{ij}(p)$ in the matrix $p$ with coefficients in $K(i)$ and
therefore (41) may be written in the form of a matrix equation


(42) $\bar D[g_c, g_{c+1}, \ldots, g_d]D = [f_c, f_{c+1}, \ldots, f_d],$


where $\bar D = (\bar d_{ji})$, $(i, j = 1, 2, \ldots, d + 1 - c)$; cf. equation (21).
It is important to notice that $\bar D$ is not the same as $D^*$, since $D^* = (d^*_{ij})$,
where $d^*_{ij} = \bar d_{ji}(p')$. Let $x$ be an indeterminate and let $D(x)$ denote the
matrix whose typical element is $d_{ij}(x)$. Then, if


(43) $|D(x)| = \rho(x) + i\sigma(x)$, $\rho(x)$, $\sigma(x)$ polynomials with coefficients in $K$,


(44) $|D| = |\rho(p) + i\sigma(p)|.$¹²


Similarly $|\bar D| = |\rho(p) - i\sigma(p)|$, so that


(45) $|\bar D|\,|D| = |(\rho(p))^2 - i^2(\sigma(p))^2| = |\omega(p)|,$


where $\omega(x)$ is a polynomial in $x$ with coefficients in $K$. Since $D$, and
similarly, $\bar D$, are both non-singular, $\bar DD$ is non-singular, so that by (45), $\omega(p)$ is
non-singular. Hence, since $\omega(p)$ is a polynomial over $K$,


(46) $\omega(p) \ne 0,$


is a necessary and sufficient condition that $D$ and $\bar D$ both be non-singular.

If $\theta$ is a root of the irreducible equation $p(x) = 0$, the field $K(\theta)$ is
simply isomorphic with the field formed by all polynomials in $p$ with
coefficients in $K$. Consequently it follows from (45) and (46) that


¹² J. Williamson, “The latent roots of a matrix of special type,” Bulletin of the
American Mathematical Society, vol. 37 (August, 1931), p. 587.


(47) $|D(\theta)|\,|\bar D(\theta)| = \omega(\theta) \ne 0$


and accordingly that both matrices $D(\theta)$ and $\bar D(\theta)$ are non-singular. Since
the elements of $D(\theta)$ are no longer matrices, $\bar D(\theta) = D^*(\theta)$, and we therefore
have, as a consequence of (42),

(48) $D^*(\theta)[g_c(\theta), \ldots, g_d(\theta)]D(\theta) = [f_c(\theta), \ldots, f_d(\theta)],$


where $D(\theta)$ and $D^*(\theta)$ are both non-singular. In other words the two
matrices $[g_c(\theta), \ldots, g_d(\theta)]$ and $[f_c(\theta), \ldots, f_d(\theta)]$ are conjunctively
equivalent. Conversely, if (48) is true and both $D(\theta)$ and $D^*(\theta)$ are non-singular,¹³
$\omega(\theta) \ne 0$ by (47) and accordingly (46) is satisfied, so that (42) is true,
where $D$ is non-singular. Hence not only does (42) imply (48) but also
(48) implies (42).

Before summing up and stating our results in the form of a theorem 
it will prove convenient to alter our notation slightly. Accordingly we relabel 
the integers 7; in the following manner ; 


(49) é; > 2 > 


where s, + and write 


= [Ke, Bers, TAP = (Ne, Nea, ° Na], 

¥5 = Joss gal, oj = [fey fests’ 


(50) 


where Using this nota- 
tion we may express our last result in the form of a lemma; 


LEMMA I. If the two canonical forms $QG$ and $QF$ of equations (36)
and (37) respectively, are equivalent, there exist $2r$ non-singular matrices
$D_j(\theta)$, $D^*_j(\theta)$ with elements in $K(\theta, i)$ such that,


(51) $D^*_j(\theta)\gamma_j(\theta)D_j(\theta) = \phi_j(\theta), \quad (j = 1, 2, \ldots, r),$


i. e., the matrices $\gamma_j(\theta)$, $\phi_j(\theta)$, $(j = 1, 2, \ldots, r)$, are equivalent under a
non-singular conjunctive transformation in the field $K(\theta, i)$.


The converse of this lemma is also true. In fact (51) implies that 


¹³ If $i$ lies in the field $K(\theta)$, $|D(\theta)| \ne 0$ does not imply $|D^*(\theta)| \ne 0$.


= 3, (7 where |D;|A~0. If W; is the matrix 
obtained from D; by replacing each element d of Dj by Eejssg...48,. It imme- 
diately follows that 


W; = (j =1, 
and since W; = W*;Q,™, that 
W* Ws = Qi 


Hence, if $W = [W_1, W_2, \ldots, W_r]$, $W$ is non-singular and $W^*QGW = QF$,
so that the two normal forms $QG$ and $QF$ are equivalent.

In stating the theorem given below we use a notation conforming with
that explained in (49) and (50); the matrices defined in (50) are associated
with a particular polynomial $p_i(\lambda)$ and for convenience in writing we dropped
the suffix $i$ but now we find it necessary to replace it. We have proved the
theorem:


THEOREM I. Let $A$ and $B$ be two matrices, of which the second $B$ is
non-singular, with elements in $K(i)$ and let $A = A^*$ and $B = B^*$. If the
elementary factors of $A - \lambda B$ are the polynomials $[p_i(\lambda)]^{\eta_{ij}}$ of (3), then a
canonical form for the pencil $A - \lambda B$ under a non-singular conjunctive
transformation is the diagonal block matrix


(52) QG(M— XE) = (Lis J, (t= 1,2,°--,¢; 


where Q is defined by (13) while Lij, Tiz and I4; are defined by (50). 
Two canonical forms QG(M—AE) and QF(M—AE), where F = [44;], 
(t= --,7i), are equivalent, if and only if the diagonal 
matrices $\gamma_{ij}(\theta_i)$ and $\phi_{ij}(\theta_i)$ are equivalent under a conjunctive transformation
in the field $K(\theta_i, i)$.


Thus, if $[p_i(\lambda)]^{\epsilon_{ij}}$ occurs exactly $s_{ij}$ times among the elementary factors
of $A - \lambda B$, in a canonical form (52), there is associated with this elementary
factor a diagonal matrix $\gamma_{ij}(\theta_i)$ of order $s_{ij}$, whose elements lie in the field
$K(\theta_i)$, where $\theta_i$ is a root of the irreducible equation $p_i(\lambda) = 0$. The matrix
$\gamma_{ij}(\theta_i)$ is determined apart from a conjunctive transformation.

Throughout we have used conjunctive transformation to include the case
of congruent transformation. We therefore see that, if $A$ and $B$ are symmetric
matrices over $K$, Theorem I is true when 'conjunctive' is replaced by
'congruent' and $K(i)$ is replaced by $K$.

We now state two corollaries of Theorem I.


COROLLARY I. We may determine the matrices $\gamma_{ij}$ of a canonical form
(52) in such a way that no element of $\gamma_{ij}$ contains a factor $r^2$, where $r$ is a
polynomial in the matrix $p$ with coefficients in $K$.


For, if i; = pij*yij, and pij is a diagonal matrix whose elements are 
polynomials in p; with coefficients in K, p*isyijpij = ij, 80 that gi; is 
equivalent to 


COROLLARY II. Two pairs of hermitian matrices $A$, $B$ and $C$, $D$, with
elements in $K(i)$, the second of each pair being non-singular, are equivalent
under a non-singular conjunctive transformation in $K(i)$, if, and only if,
the two pencils $A - \lambda B$ and $C - \lambda D$ have the same elementary factors and,
if the matrices $\gamma_{ij}(\theta_i)$, associated with each distinct elementary factor, are
conjunctively equivalent in $K(\theta_i, i)$.


COROLLARY III. Corollary II remains true if hermitian is replaced by
symmetric, conjunctive by congruent and $K(i)$ by $K$.


We may use Theorem I to determine a canonical form for any non-
singular pencil of matrices $A - \lambda B$ with elements in $K(i)$. For, if $B$ is
singular but $|A - \lambda B| \not\equiv 0$, we may determine a new basis for the pencil,
$A_1$ and $B_1$, where $B_1$ is non-singular and $A - \lambda B = A_1 - \mu B_1$.¹⁴ We apply
Theorem I to the pencil $A_1 - \mu B_1$ and thus determine a canonical form for
the non-singular pencil $A - \lambda B$.


3. Ordinary hermitian matrices and real symmetric matrices. If $K$ is
the field of all real numbers and $K(i)$ the complex number field, the
polynomials $p_i(\lambda)$ of (3) are either quadratic or linear. If $p_i(\lambda)$ is quadratic,
$K(\theta_i) = K(i)$ and hence, if $\gamma_{ij}(\theta_i)$ is one of the matrices associated with
$p_i(\lambda)$, $\gamma_{ij}(\theta_i)$ is a diagonal matrix, whose elements are complex numbers.
Let $\gamma_{ij} = [g_r]$ and let $W = [w_r]$, where $w_r = g_r^{1/2}$, if $g_r \ne 0$ and $w_r = 1$,
if $g_r = 0$. Then the matrix $W$ is a non-singular matrix with elements in
$K(\theta_i)$, as is the matrix $W^{-1}$. But $(W^{-1})'\gamma_{ij}(\theta_i)W^{-1}$ is the identity matrix.
Hence each matrix y;;(6;) associated with p;(A) may be reduced to the identity 
matrix of the corresponding order. If p;(A) =A? — + a;? + we 


may choose for p; the matrix (3 Te and for gi the matrix 6 ): 
i i 


If, however, $p_i(\lambda)$ is linear, $K(\theta_i) = K$, the field of all real numbers and by
Corollary 2 each associated $\gamma_{ij}(\theta_i)$ may be reduced to a diagonal matrix with
elements, which are either $+1$ or $-1$. If $p_i(\lambda) = \lambda - \lambda_i$, $p = \lambda_i$ and
$q = 1$. The normal form (52) therefore coincides with the normal form
given by Trott 1, page 368, formula (11). In this particular case, however,
the condition of Lemma 1 is greatly simplified, for two diagonal matrices
with real coefficients are equivalent under a conjunctive or congruent
transformation, if, and only if, they have the same signature. Thus Trott's
condition (15) merely expresses the fact that $\gamma_{ij}(\theta_i)$ is conjunctively equivalent
to $\phi_{ij}(\theta_i)$.

¹⁴ Cf. Turnbull and Aitken, op. cit., p. 117 sq.; Trott 1, p. 370.
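
The signature criterion for real diagonal matrices quoted above can be illustrated by a short Python sketch. The function names and the sample diagonal entries are assumptions introduced for this illustration only; the entries are chosen so that the reducing matrix stays rational.

```python
from fractions import Fraction as Fr

def signature(diagonal):
    """(number of positive entries, number of negative entries) of a real diagonal form."""
    pos = sum(1 for d in diagonal if d > 0)
    neg = sum(1 for d in diagonal if d < 0)
    return pos, neg

def congruent_over_reals(g, f):
    """Non-singular real diagonal matrices are congruent iff their signatures agree."""
    return signature(g) == signature(f)

def transform(w, g):
    """Diagonal congruence W' diag(g) W with W = diag(w)."""
    return [wi * gi * wi for wi, gi in zip(w, g)]

g = [Fr(4), Fr(-9), Fr(1, 4)]
w = [Fr(1, 2), Fr(1, 3), Fr(2)]     # w_r = 1 / sqrt(|g_r|), rational for these entries
reduced = transform(w, g)           # every entry becomes +1 or -1

assert all(x in (1, -1) for x in reduced)
assert signature(reduced) == signature(g)
```

The congruence changes the entries but not the counts of positive and negative signs, which is why the signature is the only invariant in this case.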

In the general case no such simplification of the conditions in Lemma 1
is possible. The conditions for the equivalence of two quadratic or hermitian
forms have been determined but are very complicated.¹⁵ We however state
necessary and sufficient conditions in the two simplest cases (a) $s_{ij} = 1$,
(b) $s_{ij} = 2$. These conditions are due to Dickson.

(a) If $\gamma_{ij}(\theta_i)$ is of order one, $\gamma_{ij}(\theta_i)$ is an element of $K(\theta_i)$. Then
$\gamma_{ij}(\theta_i)$ is equivalent to $\phi_{ij}(\theta_i)$ if, and only if, there exists an element $f$ of
$K(\theta_i, i)$ such that

$\phi_{ij} = f\bar f\gamma_{ij}(\theta_i).$

(b) If the matrices y;;(6;) and $i;(0;) are both of order two, they may 

be represented as ) and ( respectively. Then yi;(0;) 1s equiva- 


lent to pi;(0:), if, and only if, there exist elements f, g and h of K(6,1), 
such that + and In the symmetric case the 
elements f, g, h lie in K 

4. We conclude our discussion by giving an explicit form for the matrices,
$p_i$ and $q_i$, which occur in the canonical form (52). The matrix $p = p_i$ is a
matrix of order $n = n_i$, whose characteristic equation is the irreducible
equation $p_i(\lambda) = p(\lambda) = 0$ of degree $n$. If

we may take for the matrix p the companion matrix of p(A), 
0 


1 0 
0 1 


0 0 


¹⁵ L. E. Dickson, “On quadratic, bilinear, and Hermitian forms,” Transactions of
the American Mathematical Society, vol. 7 (1906), pp. 275-292; “On quadratic forms
in a general field,” Bulletin of the American Mathematical Society, vol. 14 (1907-8),
pp. 108-115; H. Hasse, “Symmetrische Matrizen im Körper der rationalen Zahlen,”
Crelle, vol. 153, pp. 12-43.


and for g = q,; the matrix 


0 
0 


0 


4 0 bs Dn-2 Dns 


where b, =; and = + + didn, (4 = 2, 
It is obvious that $q$ is non-singular, for $p$ is non-singular and hence $b_1 = a_n \ne 0$.
Moreover it is easily verified that $qp = p'q$. The matrix $q$ is not uniquely
determined by the matrix $p$, but the matrix $q$ in (53) is as simple in form as,
if not simpler than, any other matrix $q_1$, satisfying the equation $p'q_1 = q_1p$.
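
The relation $qp = p'q$ can be tested numerically. The sketch below does not reproduce the matrix (53), whose entries are not fully legible here; instead it uses a standard Hankel symmetrizer of a companion matrix, which is one of the other admissible choices of $q$ mentioned in the text. The companion-matrix convention, the helper names and the sample polynomial $x^3 - x - 1$ are assumptions made for this example.

```python
from fractions import Fraction as Fr

def companion(c):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[0] (assumed convention)."""
    n = len(c)
    rows = [[Fr(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    rows.append([Fr(-x) for x in c])
    return rows

def hankel_symmetrizer(c):
    """q[i][j] = c_(i+j+1), with c_n = 1 and c_k = 0 for k > n."""
    n = len(c)
    ext = [Fr(x) for x in c] + [Fr(1)]
    return [[ext[i + j + 1] if i + j + 1 <= n else Fr(0) for j in range(n)]
            for i in range(n)]

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

c = [-1, -1, 0]                 # p(x) = x^3 - x - 1, irreducible over Q
p = companion(c)
q = hankel_symmetrizer(c)

assert q == transpose(q)                    # q is symmetric
assert mul(q, p) == mul(transpose(p), q)    # q p = p' q
# q is non-singular: its anti-diagonal consists of ones and everything below it is zero.
assert all(q[i][len(c) - 1 - i] == 1 for i in range(len(c)))
```

Since $q$ is symmetric, $qp = p'q$ is the same statement as "$qp$ is symmetric", which is the form in which the relation enters the canonical form.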


THE JOHNS HOPKINS UNIVERSITY. 


0 ob 
(53) 0 0 0 bs 
o« 
bn 

ON THE RATIONAL CANONICAL FORM OF A FUNCTION OF A 
MATRIX. 


By NEAL H. McCOY.


Let $A$ be a matrix of order $n$ with elements in the complex number field,
and $\phi(\lambda)$ a given rational integral function of $\lambda$. In 1906, Kreis¹ gave a
method of determining the elementary divisors of $\phi(A)$ from those of $A$.
In recent years the same problem has been discussed by Krishnamurthy,²
Turnbull and Aitken,³ Rutherford⁴ and Amante.⁵ It is sufficient to consider
the case in which $A$ has a single elementary divisor $(\lambda - a)^n$, as the general
case easily reduces to this one. The principal result of these writers may then
be stated in the following way. Expand $\phi(\lambda)$ in powers of $\lambda - a$,


$\phi(\lambda) = a_0 + a_1(\lambda - a) + a_2(\lambda - a)^2 + \cdots.$


Suppose the $i$-th number of the sequence $a_1, a_2, \ldots, a_{n-1}, 1$, is the first which
is not zero. Define positive integers $k$ and $l$ by the relations,


$n = (k - 1)i + l, \quad k \ge 1, \quad 0 < l \le i.$


Then $\phi(A)$ has the elementary divisors $(\lambda - a_0)^k$ taken $l$ times, and $(\lambda - a_0)^{k-1}$
taken $i - l$ times.
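
The classical result just stated can be checked on a single Jordan block. The Python sketch below is only an illustration: the helper names, the choice $a = 2$, $n = 5$ and the polynomial $\phi(x) = 5 + (x - 2)^2$ are assumptions of this example, and the block sizes are read off from ranks of powers of $\phi(A) - a_0E$.

```python
from fractions import Fraction as Fr

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def rank(M):
    """Rank by exact Gaussian elimination."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def jordan_block(a, n):
    return [[Fr(a) if i == j else Fr(int(j == i + 1)) for j in range(n)]
            for i in range(n)]

def poly_at(coeffs, A):
    """sum coeffs[k] * A**k by Horner's rule."""
    n = len(A)
    R = [[Fr(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        R = mul(R, A)
        for d in range(n):
            R[d][d] += c
    return R

def block_sizes(N):
    """Jordan block sizes of a nilpotent matrix N, as {size: count}."""
    n = len(N)
    ranks, P = [n], [[Fr(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(n):
        P = mul(P, N)
        ranks.append(rank(P))
    at_least = [ranks[j] - ranks[j + 1] for j in range(n)]  # blocks of size >= j+1
    sizes = {}
    for j in range(n):
        exact = at_least[j] - (at_least[j + 1] if j + 1 < n else 0)
        if exact:
            sizes[j + 1] = exact
    return sizes

def predicted(n, i):
    """n = (k-1)i + l, 0 < l <= i: exponent k taken l times, k-1 taken i-l times."""
    k = (n - 1) // i + 1
    l = n - (k - 1) * i
    sizes = {k: l}
    if k > 1 and i > l:
        sizes[k - 1] = i - l
    return sizes

A = jordan_block(2, 5)            # single elementary divisor (x - 2)^5
# phi(x) = x^2 - 4x + 9 = 5 + (x - 2)^2, so a_0 = 5, a_1 = 0, a_2 = 1 and i = 2.
N = poly_at([9 - 5, -4, 1], A)    # phi(A) - a_0 E
assert block_sizes(N) == predicted(5, 2)
```

Here $5 = (3 - 1)\cdot 2 + 1$, so the statement predicts one divisor $(\lambda - 5)^3$ and one divisor $(\lambda - 5)^2$, which is what the rank computation returns.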

So far as the writer is aware, no solution has been given of the problem 
corresponding to this one, for the case in which the elements of A and all 
operations are restricted to an arbitrary domain of rationality. In this more 
general problem one does not have the use of the comparatively simple Jordan 
normal form of a matrix, and a different method of attack must therefore be 
used. It is the purpose of the present paper to present a solution of this 
problem. 

In § 4, we shall also give a brief account of an application of the main 
result to the solution of certain matric equations. 


1. The rational canonical form. Let $K$ denote a given field. Unless
otherwise stated, it will be assumed henceforth that all matrices and vectors
have coordinates in $K$, and all polynomials have coefficients in $K$. If a
polynomial is irreducible relative to the field $K$, we shall simply say that it is
irreducible.

¹ H. Kreis, Contribution à la théorie des systèmes linéaires, Zürich, 1906.

² Rao S. Krishnamurthy, “Invariant-factors of a certain class of linear substitutions,”
Journal of the Indian Mathematical Society, vol. 19 (1932), pp. 233-240.

³ H. W. Turnbull and A. C. Aitken, Canonical Matrices, Glasgow, 1932, pp. 75-76.

⁴ D. E. Rutherford, “On the canonical form of a rational integral function of a
matrix,” Proceedings of the Edinburgh Mathematical Society II, vol. 3 (1932), pp. 135-143.

⁵ S. Amante, “Sulle riduzione a forma canonica di una classe speciale di matrici,”
Atti della Reale Accademia Nazionale dei Lincei, Rendiconti VI, vol. 17 (1933), pp. 31-36
and pp. 431-436.

Let f(A) =A? - -— ay be a given polynomial, and form 


the matrix, 


This matrix may be called the companion matrix of the function $f(\lambda)$ or of
the equation $f(\lambda) = 0$.⁶ The minimum function of $B$ is then $|\lambda - B| = f(\lambda)$.
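
The two properties of the companion matrix used here, that $f(B) = 0$ and that no polynomial of lower degree annihilates $B$, can be verified directly. In the sketch below the layout of the companion matrix (ones above the diagonal, coefficients in the last row) is an assumed convention, since the displayed matrix is not legible in this copy, and the sample polynomial $x^3 - 2$ is an arbitrary choice.

```python
from fractions import Fraction as Fr

def companion(c):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[0] (assumed convention)."""
    n = len(c)
    rows = [[Fr(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    rows.append([Fr(-x) for x in c])
    return rows

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def monic_poly_at(c, B):
    """B^n + c[n-1] B^(n-1) + ... + c[0] E by Horner's rule."""
    n = len(B)
    R = [[Fr(int(i == j)) for j in range(n)] for i in range(n)]
    for coeff in reversed(c):
        R = mul(R, B)
        for d in range(n):
            R[d][d] += coeff
    return R

def krylov_rows(v, B, count):
    """The row vectors v, vB, ..., vB^(count-1)."""
    rows = []
    for _ in range(count):
        rows.append(v)
        v = mul([v], B)[0]
    return rows

c = [-2, 0, 0]                       # f(x) = x^3 - 2
B = companion(c)
zero = [[Fr(0)] * 3 for _ in range(3)]
unit = [[Fr(int(i == j)) for j in range(3)] for i in range(3)]

assert monic_poly_at(c, B) == zero                        # f(B) = 0
# (1,0,0), (1,0,0)B, (1,0,0)B^2 are the unit vectors, hence independent,
# so no non-zero polynomial of degree < 3 annihilates B.
assert krylov_rows([Fr(1), Fr(0), Fr(0)], B, 3) == unit
```

With this convention the vector $(1, 0, \ldots, 0)$ generates the whole space under powers of $B$, which is the property of $U_1$ used later in § 1.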
Let now $A$ be a given matrix of order $n$, and $E_i(\lambda)$, $(i = 1, 2, \ldots, r)$,
the non-constant invariant factors of $\lambda - A$. Perhaps the most common
rational canonical form of $A$ (with respect to similarity transformations)
is a matrix $A_1$, which is the direct sum⁷ of the companion matrices of the
$E_i(\lambda)$. However, it will be convenient for our purpose to use a somewhat
different rational canonical form, which will now be described.
If we factor the $E_i(\lambda)$ into powers of distinct, irreducible polynomials
$p_k(\lambda)$, each of which has leading coefficient unity, say


$E_i(\lambda) = [p_1(\lambda)]^{e_{i1}}[p_2(\lambda)]^{e_{i2}} \cdots [p_w(\lambda)]^{e_{iw}},$


then such of the factors $[p_k(\lambda)]^{e_{ik}}$ as are not mere constants may be called
the elementary divisors of $A$. We can then choose as a canonical form of $A$,
a matrix $A_2$, which is the direct sum of the companion matrices of the
elementary divisors of $A$.⁸ This is the canonical form used throughout this paper.
The advantage of this form over the other lies in the fact that if $A = C \dotplus D$,
the canonical form of $A$ is the direct sum of the canonical forms of $C$ and $D$,
and the elementary divisors of $A$ are the elementary divisors of $C$, together
with those of $D$.


⁶ See C. C. MacDuffee, The Theory of Matrices, Berlin, 1933, p. 20.

⁷ If $A = \begin{pmatrix} C & 0 \\ 0 & D \end{pmatrix}$, where $C$ and $D$ are square matrices, then $A$ is called the direct
sum of $C$ and $D$, and we write, $A = C \dotplus D$.

⁸ W. Krull, “Theorie und Anwendung der verallgemeinerten Abelschen Gruppen,”
Sitzungsberichte Heidelberger Akademie der Wissenschaften, 1926, pp. 25-28; B. L. van
der Waerden, Moderne Algebra, vol. 2, Berlin, 1931, p. 137. For a somewhat different
canonical form, but one which also uses the notion of elementary divisors, see J. H. M.
Wedderburn, “Note on matrices in a given field,” Annals of Mathematics, vol. 27
(1926), pp. 245-248.


We shall now establish two lemmas⁹ which will be useful also in a later
section of the paper, and then apply them to show how to find a non-singular
matrix which transforms $A_1$ into $A_2$.


LEMMA 1. Let $V_j$ be a given row vector of dimension $n$, $X$ an arbitrary
column vector of dimension $n$, and $B$ a given square matrix of order $n$.
Denote by $R_j$ the matrix of $e_j$ rows and $n$ columns, whose rows are respectively
the vectors

$V_j, \; V_jB, \; V_jB^2, \; \ldots, \; V_jB^{e_j-1}.$

If now $h_j(\lambda) = \lambda^{e_j} - b_1\lambda^{e_j-1} - \cdots - b_{e_j}$ is a polynomial such that
$V_jh_j(B) = 0$, and we set $\xi_j = R_jX$, $Y = BX$, $\eta_j = R_jY$, then it follows that
$\eta_j = Q_j\xi_j$, where $Q_j$ is the companion matrix of $h_j(\lambda)$.


By definition, we see that = {&)1, €je,} and nj = {nj1, 
are column vectors of dimension ej. The lemma follows at once from the 
following calculation: 


Wie, ™ Vj b1€),¢,-1 + bo€j,6,-2 + + 


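LEMMA 2. Let $e_j$ $(j = 1, 2, \ldots, q)$ be positive integers whose sum is $n$,
and for each $j$ suppose $V_j$, $R_j$, $\xi_j$, $\eta_j$, $h_j(\lambda)$, $Q_j$, satisfy the conditions of
Lemma 1. If we set


$R = \begin{pmatrix} R_1 \\ R_2 \\ \vdots \\ R_q \end{pmatrix}, \quad \xi = \begin{pmatrix} \xi_1 \\ \xi_2 \\ \vdots \\ \xi_q \end{pmatrix}, \quad \eta = \begin{pmatrix} \eta_1 \\ \eta_2 \\ \vdots \\ \eta_q \end{pmatrix},$


then $\xi = RX$, $Y = BX$, $\eta = RY$, and $\eta = Q\xi$, where $Q = Q_1 \dotplus Q_2 \dotplus \cdots \dotplus Q_q$.
Further, if $R$ is non-singular, then $Q = RBR^{-1}$.


The first part follows almost immediately from the preceding lemma.
We then find that $RBX = QRX$. But since $X$ is entirely arbitrary, we must
have $RB = QR$. Hence if $R$ is non-singular, $Q = RBR^{-1}$.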


⁹ I am indebted to a referee for suggesting the introduction of these lemmas. Their
use has considerably improved the proof of Theorem 1.


Let $U = (u_1, u_2, \ldots, u_n)$ be any row vector. Then there exists a unique
polynomial $g(\lambda)$ of minimum degree and with leading coefficient unity, such
that $Ug(A) = 0$.¹⁰ This polynomial $g(\lambda)$ is called the R.C.F. (Reduced
Characteristic Function) of $A$ relative to $U$, and its degree may be called the
grade of $U$ (relative to $A$). A fundamental property of the R.C.F. of $A$
relative to $U$ is that, if $h(\lambda)$ is a polynomial such that $Uh(A) = 0$, then
$h(\lambda)$ is divisible by $g(\lambda)$.

We now return to the problem of transforming the matrix $A_1$ into $A_2$.
Since $A_1$ is the direct sum of the companion matrices of the invariant factors
of $\lambda - A$, we may assume for our purpose, that $\lambda - A$ has a single invariant
factor $E(\lambda)$ of degree $n$. If $E(\lambda)$ is a power of an irreducible polynomial,
then clearly $A_1 = A_2$. Hence suppose $E(\lambda) = \phi(\lambda)\psi(\lambda)$, where $\phi(\lambda)$ and
$\psi(\lambda)$ are relatively prime, and have leading coefficients equal to unity. Let
the degrees of $\phi(\lambda)$ and $\psi(\lambda)$ be respectively $n_1$ and $n_2$.

From the form of $A_1$, it follows that the vector, $U_1 = (1, 0, \ldots, 0)$,
is of grade $n$ relative to $A_1$, and the R.C.F. of $A_1$ relative to $U_1$ is therefore
$E(\lambda)$. We may now apply the above lemmas by placing $e_1 = n_2$, $e_2 = n_1$,
$B = A_1$, $V_1 = U_1\phi(A_1)$, $V_2 = U_1\psi(A_1)$. Since $V_1\psi(A_1) = V_2\phi(A_1) = 0$, it follows
at once that $h_1(\lambda) = \psi(\lambda)$, $h_2(\lambda) = \phi(\lambda)$. If now we assume for the moment
that $R$ is non-singular, we see by means of Lemma 2 that $RA_1R^{-1} = Q$, where
$Q$ is the direct sum of the companion matrices of $\phi(\lambda)$ and of $\psi(\lambda)$. If either
$\phi(\lambda)$ or $\psi(\lambda)$ can be expressed as a product of relatively prime factors, the
process can be continued, and so on until the form $A_2$ is reached.
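
The passage from $A_1$ to $A_2$ described above may be followed on a small case. The Python sketch below takes $E(\lambda) = (\lambda^2 + 1)(\lambda - 2)$ over the rationals; this polynomial, the helper names and the companion-matrix convention are assumptions of the example. The blocks of $Q$ appear in the order $h_1 = \psi$, $h_2 = \phi$ given by Lemma 2.

```python
from fractions import Fraction as Fr

def companion(c):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[0] (assumed convention)."""
    n = len(c)
    rows = [[Fr(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    rows.append([Fr(-x) for x in c])
    return rows

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def poly_at(coeffs, A):
    """sum coeffs[k] * A**k by Horner's rule."""
    n = len(A)
    R = [[Fr(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        R = mul(R, A)
        for d in range(n):
            R[d][d] += c
    return R

def krylov_rows(v, B, count):
    """The row vectors v, vB, ..., vB^(count-1)."""
    rows = []
    for _ in range(count):
        rows.append(v)
        v = mul([v], B)[0]
    return rows

def direct_sum(X, Y):
    return ([row + [Fr(0)] * len(Y[0]) for row in X] +
            [[Fr(0)] * len(X[0]) + row for row in Y])

# E(x) = (x^2 + 1)(x - 2) = x^3 - 2x^2 + x - 2; phi = x^2 + 1, psi = x - 2.
A1 = companion([-2, 1, -2])
phi, psi = [1, 0, 1], [-2, 1]        # coefficients, lowest degree first
n1, n2 = 2, 1

V1 = poly_at(phi, A1)[0]             # U_1 phi(A_1) with U_1 = (1, 0, 0)
V2 = poly_at(psi, A1)[0]             # U_1 psi(A_1)
R = krylov_rows(V1, A1, n2) + krylov_rows(V2, A1, n1)   # e_1 = n_2, e_2 = n_1
Q = direct_sum(companion([-2]), companion([1, 0]))      # companions of psi, phi

assert mul(R, A1) == mul(Q, R)       # R A_1 = Q R, as in Lemma 2
```

For this example $R$ has determinant 5, so it is non-singular and $RA_1R^{-1} = Q$ follows from $RA_1 = QR$.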

We now show that $R$ is non-singular. For suppose there exists a relation


$\sum_{i=0}^{n_2-1} c_iU_1\phi(A_1)A_1^{i} + \sum_{i=0}^{n_1-1} d_iU_1\psi(A_1)A_1^{i} = 0.$

Since $E(\lambda)$ is the R.C.F. of $A_1$ relative to $U_1$, this implies that the polynomial

$F(\lambda) = \phi(\lambda)\sum_{i=0}^{n_2-1} c_i\lambda^{i} + \psi(\lambda)\sum_{i=0}^{n_1-1} d_i\lambda^{i},$


is divisible by $E(\lambda)$, and being of degree at most $n - 1$, must therefore
vanish identically. From the fact that $\phi(\lambda)$ and $\psi(\lambda)$ are relatively prime,
it follows that all coefficients $c_i$ and $d_i$ must be zero, and thus $R$ is non-singular.


¹⁰ Turnbull and Aitken, op. cit., chap. 6.


2. Another lemma. Let $p(\lambda)$ and $\phi(\lambda)$ be given polynomials, of which
the first is irreducible and of degree $s \ge 1$. Since there exist at most $s$
polynomials which are linearly independent modulo $p(\lambda)$, it follows that the
polynomials $1, \phi, \phi^2, \ldots, \phi^s$ are linearly dependent modulo $p(\lambda)$. By a
familiar argument, there then exists a unique polynomial $f(x)$, with leading
coefficient unity, and of minimum degree, such that


$f(\phi(\lambda)) \equiv 0 \pmod{p(\lambda)}.$


It follows readily that $f(x)$ is irreducible, and also that if $g(x)$ is a
polynomial, such that $g(\phi(\lambda)) \equiv 0 \pmod{p(\lambda)}$, then $g(x) \equiv 0 \pmod{f(x)}$. We
shall let $t$ denote the degree of $f(x)$.
Let $\rho$ denote a root of $p(\lambda) = 0$ in a properly extended field, and consider
the three fields,
$K \subset K(\phi(\rho)) \subset K(\rho).$


The field $K(\phi(\rho))$ is seen to be of degree $t$ over $K$, as $\phi(\rho)$ satisfies the
irreducible equation $f(x) = 0$. Also $K(\rho)$ is of degree $s$ over $K$. Hence, by
a well known theorem,¹¹ $K(\rho)$ is algebraic of degree $m = s/t$ over the field
$K(\phi(\rho))$. That is, $t$ is a divisor of $s$, and $\rho$ satisfies no equation of degree
less than $m$ with coefficients in $K(\phi(\rho))$. We may now prove the following
lemma:


LEMMA 3. If $F(x, y)$ is a polynomial in the indeterminates $x$, $y$, of
degree at most $m - 1$ in $y$, and if


$F(\phi(\lambda), \lambda) \equiv 0 \pmod{p(\lambda)},$
then $F(x, y) \equiv 0 \pmod{f(x)}$.


Under the hypotheses of the lemma, we have $F(\phi(\rho), \rho) = 0$. Let
$F(x, y) = \sum_{i=0}^{m-1} F_i(x)y^{i}$. We have then $\sum_{i=0}^{m-1} F_i(\phi(\rho))\rho^{i} = 0$. But if some
$F_i(\phi(\rho)) \ne 0$, this contradicts the fact that $\rho$ can satisfy no equation of degree
less than $m$ with coefficients in $K(\phi(\rho))$. Hence $F_i(\phi(\rho)) = 0$, and thus
$F_i(\phi(\lambda)) \equiv 0 \pmod{p(\lambda)}$, $(i = 0, 1, \ldots, m - 1)$. It follows that each
$F_i(x) \equiv 0 \pmod{f(x)}$, and the lemma is established. We remark that if
the degree of $F(x, y)$ in $x$ is at most $t - 1$, then $F(x, y)$ vanishes identically.
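
The quantities $f$, $t$ and $m = s/t$ of this section can be computed in a concrete case. The sketch below takes $p(\lambda) = \lambda^4 - 2$ and $\phi(\lambda) = \lambda^2$ over the rationals and represents $\rho$ by a companion matrix of $p$; all of these choices, the helper names and the companion convention are assumptions of the example. Because $p$ is irreducible, the polynomials in that matrix form a field, so a polynomial $g$ with $e_1\,g(\phi(B)) = 0$ already satisfies $g(\phi(B)) = 0$; the code relies on this when it works with the single row vector $e_1$.

```python
from fractions import Fraction as Fr

def companion(c):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[0] (assumed convention)."""
    n = len(c)
    rows = [[Fr(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    rows.append([Fr(-x) for x in c])
    return rows

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def poly_at(coeffs, A):
    """sum coeffs[k] * A**k by Horner's rule."""
    n = len(A)
    R = [[Fr(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        R = mul(R, A)
        for d in range(n):
            R[d][d] += c
    return R

def solve_combination(basis, target):
    """Coefficients c with sum c[k] * basis[k] == target, or None if none exist.
    The vectors in basis are assumed to be linearly independent."""
    k, n = len(basis), len(target)
    rows = [[basis[j][i] for j in range(k)] + [target[i]] for i in range(n)]
    for col in range(k):
        piv = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for i in range(n):
            if i != col:
                f = rows[i][col]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[col])]
    if any(rows[i][k] != 0 for i in range(k, n)):
        return None
    return [rows[j][k] for j in range(k)]

def minimal_polynomial(Mx):
    """Monic f of least degree with e_1 f(Mx) = 0; coefficients, lowest degree first."""
    v = [Fr(1)] + [Fr(0)] * (len(Mx) - 1)
    basis = []
    while True:
        c = solve_combination(basis, v)
        if c is not None:
            return [-x for x in c] + [Fr(1)]
        basis.append(v)
        v = mul([v], Mx)[0]

s = 4
B = companion([-2, 0, 0, 0])            # p(x) = x^4 - 2, irreducible over Q
f = minimal_polynomial(poly_at([0, 0, 1], B))   # phi(x) = x^2
t = len(f) - 1
assert s % t == 0                       # t is a divisor of s
m = s // t
```

In this case $f(x) = x^2 - 2$, so $t = 2$ and $m = 2$: the field $K(\rho)$ is of degree 2 over $K(\phi(\rho)) = K(\sqrt{2})$.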


3. The elementary divisors of $\phi(A)$. We come now to the main problem
of the paper. Let $A$ be a given matrix, and $\phi(\lambda)$ a given polynomial in $\lambda$.
Since $\phi(HAH^{-1}) = H\phi(A)H^{-1}$, there is no loss of generality in assuming that
$A$ is in canonical form. If $A = A_1 \dotplus A_2$, then $\phi(A) = \phi(A_1) \dotplus \phi(A_2)$, and
the elementary divisors of $\phi(A)$ are precisely those of $\phi(A_1)$, together with
those of $\phi(A_2)$. We shall therefore assume henceforth that $A$ has a single
elementary divisor $[p(\lambda)]^r$. It follows that the minimum function of $A$ is
$[p(\lambda)]^r$. If the degree of $p(\lambda)$ is $s$, then the order of $A$ is $n = rs$.

¹¹ See, e. g., van der Waerden, op. cit., vol. 1, p. 98.


Let $f(x)$ denote the unique irreducible polynomial of degree $t$ defined in
the preceding section. Then we have


$f(\phi(\lambda)) \equiv 0 \pmod{p(\lambda)}.$


It may well happen that $f(\phi(\lambda))$ is divisible by a power of $p(\lambda)$ greater than
the first. Suppose that it is divisible by $[p(\lambda)]^q$ but not by $[p(\lambda)]^{q+1}$. We
now define an integer $i$ as follows. If $q \ge r$, we set $i = r$, while if $q < r$, we
place $i = q$. Hence in either case we have $f(\phi(\lambda)) \equiv 0 \pmod{[p(\lambda)]^i}$. We
further define positive integers $k$, $l$ by the relations,


(1) $r = (k - 1)i + l, \quad k \ge 1, \quad 0 < l \le i.$


It follows that $[f(\phi(\lambda))]^k \equiv 0 \pmod{[p(\lambda)]^r}$, while
$[f(\phi(\lambda))]^{k-1} \not\equiv 0 \pmod{[p(\lambda)]^r}.$


The minimum function of $\phi(A)$ is therefore $[f(\lambda)]^k$, and the elementary
divisors of $\phi(A)$ are all powers of $f(\lambda)$. If we denote the integer $s/t$ by $m$,
we may state the following precise result:


THEOREM 1. The matrix $\phi(A)$ has as elementary divisors, $[f(\lambda)]^k$
taken $lm$ times, and $[f(\lambda)]^{k-1}$ taken $m(i - l)$ times.
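
Theorem 1 can be compared with a direct rank computation in a small case. In the Python sketch below, $p(\lambda) = \lambda^2 - 2$, $r = 3$ and $\phi(\lambda) = [p(\lambda)]^2$ over the rationals, so that $\phi(\rho) = 0$, $f(x) = x$, $t = 1$ and $q = 2$; these values are worked out by hand and supplied to the code, and the helper names and companion-matrix convention are assumptions of the example. The observed multiplicities use the fact that the number of elementary divisors $[f]^e$ with $e \ge j$ equals $(\operatorname{rank} F^{j-1} - \operatorname{rank} F^{j})/t$ for $F = f(\phi(A))$.

```python
from fractions import Fraction as Fr

def poly_mul(a, b):
    out = [Fr(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def companion(poly):
    """Companion matrix (assumed convention) of a monic polynomial, lowest degree first."""
    c = poly[:-1]
    n = len(c)
    rows = [[Fr(int(j == i + 1)) for j in range(n)] for i in range(n - 1)]
    rows.append([-Fr(x) for x in c])
    return rows

def mul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def poly_at(coeffs, A):
    n = len(A)
    R = [[Fr(0)] * n for _ in range(n)]
    for c in reversed(coeffs):
        R = mul(R, A)
        for d in range(n):
            R[d][d] += c
    return R

def rank(M):
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r

def predicted_divisors(r, s, t, q):
    """{exponent: multiplicity} of the powers of f, following Theorem 1."""
    i = r if q >= r else q
    k = (r - 1) // i + 1             # r = (k-1) i + l with 0 < l <= i
    l = r - (k - 1) * i
    m = s // t
    out = {k: l * m}
    if k > 1 and i > l:
        out[k - 1] = m * (i - l)
    return out

def observed_divisors(F, t):
    """{exponent: multiplicity}, read off from ranks of powers of F = f(phi(A))."""
    n = len(F)
    ranks, P = [n], [[Fr(int(a == b)) for b in range(n)] for a in range(n)]
    for _ in range(n):
        P = mul(P, F)
        ranks.append(rank(P))
    at_least = [(ranks[j] - ranks[j + 1]) // t for j in range(n)]
    out = {}
    for j in range(n):
        exact = at_least[j] - (at_least[j + 1] if j + 1 < n else 0)
        if exact:
            out[j + 1] = exact
    return out

p = [Fr(-2), Fr(0), Fr(1)]                       # p(x) = x^2 - 2, s = 2
r, s = 3, 2
A = companion(poly_mul(poly_mul(p, p), p))       # single elementary divisor p^3
phi = poly_mul(p, p)                             # phi(x) = p(x)^2
F = poly_at(phi, A)                              # f(x) = x, so f(phi(A)) = phi(A)
assert observed_divisors(F, 1) == predicted_divisors(r, s, t=1, q=2)
```

Here $i = 2$, $k = 2$, $l = 1$ and $m = 2$, so the theorem predicts $x^2$ taken twice and $x$ taken twice, whose degrees add up to $n = rs = 6$.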


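We shall prove this theorem by actually exhibiting a matrix $R$ which
transforms $\phi(A)$ to canonical form. Let $U$ denote a vector of grade $n = rs$
with respect to $A$.¹² The R.C.F. of $A$ relative to $U$ is then $[p(\lambda)]^r$.

Let $\alpha$ and $\beta$ be integers such that $0 \le \alpha \le i - 1$, $0 \le \beta \le m - 1$. We
shall now make use of Lemma 1, the notation being as in the statement of
the lemma, with the exception that we shall find it convenient to replace each
subscript $j$ by the two subscripts $\alpha$ and $\beta$. That is, $e_j$ becomes $e_{\alpha\beta}$, $h_j(\lambda)$
becomes $h_{\alpha\beta}(\lambda)$, and so on. Two cases will be considered separately.


Case 1. $0 \le \alpha \le l - 1$, $0 \le \beta \le m - 1$. Let $e_{\alpha\beta} = tk$,
$V_{\alpha\beta} = U[p(A)]^{\alpha}A^{\beta}$, $B = \phi(A)$. Since $U[p(A)]^{\alpha}A^{\beta}[f(\phi(A))]^{k} = 0$, it
follows that $h_{\alpha\beta}(\lambda) = [f(\lambda)]^k$, and $Q_{\alpha\beta}$ is therefore the companion matrix
of $[f(\lambda)]^k$. The matrix $R_{\alpha\beta}$ has as rows the vectors,


(2) $U[p(A)]^{\alpha}A^{\beta}, \; U[p(A)]^{\alpha}A^{\beta}\phi(A), \; \ldots, \; U[p(A)]^{\alpha}A^{\beta}[\phi(A)]^{tk-1}.$


Case 2. $l \le \alpha \le i - 1$, $0 \le \beta \le m - 1$. In this case, let $e_{\alpha\beta} = t(k - 1)$,
$V_{\alpha\beta} = U[p(A)]^{\alpha}A^{\beta}$, $B = \phi(A)$. Since now $f(\phi(A))$ is divisible by $[p(A)]^i$,
it follows by relations (1), that $U[p(A)]^{\alpha}A^{\beta}[f(\phi(A))]^{k-1} = 0$, and thus that
$h_{\alpha\beta}(\lambda) = [f(\lambda)]^{k-1}$. The matrix $R_{\alpha\beta}$ has as rows the vectors,


(3) $U[p(A)]^{\alpha}A^{\beta}, \; U[p(A)]^{\alpha}A^{\beta}\phi(A), \; \ldots, \; U[p(A)]^{\alpha}A^{\beta}[\phi(A)]^{t(k-1)-1}.$


¹² If $A$ is in canonical form, we may choose $U = (1, 0, \ldots, 0)$, as in § 1.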


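It is easily seen that $\sum_{\alpha=0}^{i-1}\sum_{\beta=0}^{m-1} c_{\alpha\beta} = n$, and the hypotheses of Lemma 2 are
satisfied. The matrix $R$ then is formed by arranging the matrices $R_{\alpha\beta}$
($\alpha = 0, 1, \cdots, i-1$; $\beta = 0, 1, \cdots, m-1$) in some fixed order, and using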
the same order for the ag and ag to define € and 7 as in the statement of the 
lemma. Let us now assume for the present that $R$ is non-singular. We then
have $Q = R\phi(A)R^{-1}$, and by the determinations of $h_{\alpha\beta}(\lambda)$ above, we see that $Q$
is the direct sum of the companion matrix of $[f(\lambda)]^k$ taken $ml$ times, and
of the companion matrix of $[f(\lambda)]^{k-1}$ taken $m(i-l)$ times. But since $f(\lambda)$
is irreducible, $Q$ is therefore the canonical form of $\phi(A)$, and the elementary
divisors are those stated in the theorem.

There remains only to prove that $R$ is non-singular. Any linear combination
$UF(\phi(A), A)$ of the row vectors of $R$ (of the types (2) and (3))
corresponds to a polynomial $F(x, y)$ of the form

(4) $F(x, y) = \sum_{j=0}^{i-1} F_j(x, y)\,[p(y)]^j,$

where the degree of $F_j(x, y)$ is at most $m-1$ in $y$, while its degree in $x$ is
at most $tk-1$ for $j = 0, 1, \cdots, l-1$, and at most $t(k-1)-1$ for
$j = l, l+1, \cdots, i-1$. If the linear combination of the rows of $R$ is the
zero vector, we have

$UF(\phi(A), A) = 0,$

and since the R. C. F. of $A$ relative to $U$ is $[p(\lambda)]^r$, it follows that

(5) $F(\phi(\lambda), \lambda) \equiv 0 \pmod{[p(\lambda)]^r}.$


We shall complete the proof by showing that under these conditions, all
$F_j(x, y)$ vanish identically, and thus the rows of $R$ are linearly independent.

We first dispose of the special case in which $i = r$, and hence $k = 1$, $l = r$.
In this case, all $F_j(x, y)$ are of degree at most $t-1$ in $x$. From relation (5),
we find that

$\sum_{j=0}^{r-1} F_j(\phi(\lambda), \lambda)\,[p(\lambda)]^j \equiv 0 \pmod{[p(\lambda)]^r}.$

Now clearly $F_0(\phi(\lambda), \lambda) \equiv 0 \pmod{p(\lambda)}$, and by Lemma 3, it follows that
$F_0(x, y) \equiv 0 \pmod{f(x)}$. But being of degree at most $t-1$ in $x$, $F_0(x, y)$




498 NEAL H. MCCOY. 


must vanish identically. We now pass on to $F_1(x, y)$, and a similar argument
shows that it is also identically zero. A continuation of this process establishes
the fact that all $F_j(x, y)$ vanish identically.

Suppose now that $i < r$. By definition of $i$, we know that $f(\phi(\lambda))$ is
then divisible by $[p(\lambda)]^i$ but not by $[p(\lambda)]^{i+1}$. We now assume that all
$F_j(x, y)$ are divisible by $[f(x)]^{\gamma}$ where $0 \le \gamma < k-1$, and shall show that
they are all divisible by $[f(x)]^{\gamma+1}$. If we set $F_j(x, y) = [f(x)]^{\gamma}F'_j(x, y)$,
we get from (5),

(6) $\sum_{j=0}^{i-1} F'_j(\phi(\lambda), \lambda)\,[p(\lambda)]^j \equiv 0 \pmod{[p(\lambda)]^{r-\gamma i}}.$

Clearly $F'_0(\phi(\lambda), \lambda) \equiv 0 \pmod{p(\lambda)}$, and by Lemma 3 we have
$F'_0(x, y) \equiv 0 \pmod{f(x)}$. Suppose that $F'_j(x, y) \equiv 0 \pmod{f(x)}$,
$(j = 0, 1, \cdots, \delta)$ where $0 \le \delta < i-1$. Since $r - \gamma i > i$, it follows that
$F'_{\delta+1}(\phi(\lambda), \lambda) \equiv 0 \pmod{p(\lambda)}$, and thus $F'_{\delta+1}(x, y) \equiv 0 \pmod{f(x)}$. Hence
$F'_j(x, y) \equiv 0 \pmod{f(x)}$, $(j = 0, 1, \cdots, i-1)$. It therefore follows that
all $F_j(x, y)$ are divisible by $[f(x)]^{\gamma+1}$, and a process of induction then shows
that they are all divisible by $[f(x)]^{k-1}$. But the $F_j(x, y)$ $(j = l, l+1, \cdots, i-1)$
are of degree at most $t(k-1)-1$ in $x$, and hence must vanish identically.

Now let $F_j(x, y) = [f(x)]^{k-1}F^*_j(x, y)$, $(j = 0, 1, \cdots, l-1)$. From
relation (5) we then have

$\sum_{j=0}^{l-1} F^*_j(\phi(\lambda), \lambda)\,[p(\lambda)]^j \equiv 0 \pmod{[p(\lambda)]^{l}}.$

A repetition of the argument of the preceding paragraphs shows that each
$F^*_j(x, y)$ is divisible by $f(x)$, and thus $F_j(x, y)$ is divisible by $[f(x)]^k$,
$(j = 0, 1, \cdots, l-1)$. But these $F_j(x, y)$ are of degree at most $tk-1$ in $x$,
and must therefore vanish identically. This completes the proof of the theorem.


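Examples. Let $K$ be an algebraically closed field, and suppose $A$ has
the single elementary divisor $(\lambda - a)^n$. Then in terms of our notation,
we have $p(\lambda) = \lambda - a$, $s = 1$, $r = n$. Let now $\phi(\lambda)$ be expanded in powers
of $\lambda - a$,

$\phi(\lambda) = a_0 + a_1(\lambda - a) + a_2(\lambda - a)^2 + \cdots.$

Then clearly $\phi(\lambda) - a_0 \equiv 0 \pmod{(\lambda - a)}$, so that $f(x) = x - a_0$, and
$t = 1$, $m = s = 1$. If now the first number of the sequence $a_1, a_2, \cdots, a_{n-1}, 1$,
which is not zero is the $i$-th, then we have

$f(\phi(\lambda)) \equiv 0 \pmod{(\lambda - a)^i},$

while if $i < n$,

$f(\phi(\lambda)) \not\equiv 0 \pmod{(\lambda - a)^{i+1}}.$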




Thus this definition of $i$ corresponds to that given in the notation above, and
if we define $k$ and $l$ by the relations (1), our theorem tells us that $\phi(A)$ has
the elementary divisor $(\lambda - a_0)^k$ taken $l$ times and $(\lambda - a_0)^{k-1}$ taken $(i - l)$
times. Thus our general theorem reduces to the one obtained previously for
this case by the writers referred to in the introduction.
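
The bookkeeping of this first example can be sketched in a few lines of Python. This is only an illustration, not part of the original paper: the function names are hypothetical, and relation (1) is assumed to read r = (k - 1)i + l with 1 <= l <= i, which is the form that makes the degrees in Theorem 1 add up to n.

```python
def jordan_block_split(n, coeffs):
    """Return (i, k, l) for phi applied to a single n-by-n block with
    elementary divisor (lambda - a)^n.

    coeffs lists a_1, a_2, ... (the expansion coefficients after a_0);
    missing trailing coefficients are treated as zero.
    """
    # Sequence a_1, ..., a_{n-1}, 1: the final 1 guarantees that i <= n.
    seq = [coeffs[j] if j < len(coeffs) else 0 for j in range(n - 1)] + [1]
    i = next(pos for pos, c in enumerate(seq, start=1) if c != 0)
    k = -(-n // i)            # ceil(n / i)
    l = n - (k - 1) * i       # assumed relation (1) with r = n, so 1 <= l <= i
    return i, k, l


def divisor_exponents(n, coeffs):
    """Exponents of (lambda - a_0) in the elementary divisors of phi(A)."""
    i, k, l = jordan_block_split(n, coeffs)
    return [k] * l + [k - 1] * (i - l)
```

For instance, with n = 5 and phi(lambda) = a_0 + 3(lambda - a)^2 the call `divisor_exponents(5, [0, 3])` gives one divisor of exponent 3 and one of exponent 2, whose degrees again total 5.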

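As a second example, let $K$ be the field of real numbers, and $A$ a matrix
of order $n = 6$, with the elementary divisor $(\lambda^2 + 1)^3$. We may take $A$ in
the canonical form,

$A = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ -1 & 0 & -3 & 0 & -3 & 0 \end{pmatrix}.$

Then $p(\lambda) = \lambda^2 + 1$, $s = 2$, $r = 3$. Suppose $\phi(\lambda) = \lambda^3 + 3\lambda$. Then
$\phi(\lambda) \equiv 2\lambda \pmod{p(\lambda)}$, $[\phi(\lambda)]^2 \equiv -4 \pmod{p(\lambda)}$, and hence $f(x) = x^2 + 4$.
We have then $t = 2$, $m = 1$. It is easily verified that $\phi^2 + 4 \equiv 0 \pmod{[p(\lambda)]^2}$,
but $\phi^2 + 4 \not\equiv 0 \pmod{[p(\lambda)]^3}$. Hence $i = 2$, $k = 2$, $l = 1$. Our theorem then
states that $\phi(A) = A^3 + 3A$ has the elementary divisors $(\lambda^2 + 4)^2$, $\lambda^2 + 4$.
Thus the canonical form of $\phi(A)$ is the matrix

$Q = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ -16 & 0 & -8 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & -4 & 0 \end{pmatrix}.$

The vector $U = (1, 0, 0, 0, 0, 0)$ is of grade 6 relative to $A$, and so the matrix
$R$ is a matrix whose rows are respectively $U$, $U\phi(A)$, $U[\phi(A)]^2$, $U[\phi(A)]^3$,
$U(A^2 + 1)$, $U(A^2 + 1)\phi(A)$. A calculation shows that

$R = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 3 & 0 & 1 & 0 & 0 \\ -1 & 0 & 6 & 0 & 3 & 0 \\ 0 & -6 & 0 & 8 & 0 & 6 \\ 1 & 0 & 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 4 & 0 & 1 \end{pmatrix}.$

It is easily verified that $R\phi(A) = QR$, and hence that $Q = R\phi(A)R^{-1}$.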
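
The verification left to the reader in this second example can be carried out mechanically. The following sketch is an illustration only: it assumes the companion-matrix convention with ones on the superdiagonal and the negated coefficients in the last row, and the helper names are hypothetical. It rebuilds R row by row from U and checks the relation R phi(A) = Q R in exact integer arithmetic.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def companion(c):
    """Companion matrix of x^n + c[n-1] x^(n-1) + ... + c[0]
    (assumed convention: ones above the diagonal, last row -c)."""
    n = len(c)
    C = [[1 if j == i + 1 else 0 for j in range(n)] for i in range(n)]
    C[n - 1] = [-v for v in c]
    return C


def direct_sum(X, Y):
    return [row + [0] * len(Y) for row in X] + [[0] * len(X) + row for row in Y]


n = 6
A = companion([1, 0, 3, 0, 3, 0])          # (x^2 + 1)^3 = x^6 + 3x^4 + 3x^2 + 1
A2 = matmul(A, A)
A3 = matmul(A2, A)
phi = [[A3[i][j] + 3 * A[i][j] for j in range(n)] for i in range(n)]        # A^3 + 3A
p_of_A = [[A2[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]  # A^2 + 1


def orbit(v, count):
    """Rows v, v*phi, ..., v*phi^(count-1)."""
    rows = []
    for _ in range(count):
        rows.append(v)
        v = matmul([v], phi)[0]
    return rows


U = [1, 0, 0, 0, 0, 0]
R = orbit(U, 4) + orbit(matmul([U], p_of_A)[0], 2)
Q = direct_sum(companion([16, 0, 8, 0]), companion([4, 0]))   # (x^2+4)^2 and x^2+4
relation_holds = matmul(R, phi) == matmul(Q, R)
```

Because all entries are integers, the comparison is exact and no tolerance is needed.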




4. Solution of matric equations. Let $B$ be a given matrix of order $n$,
and $\phi(\lambda)$ a given polynomial in the scalar variable $\lambda$. It will be understood
that all elements and operations are to be restricted to the given field $K$.
We shall now give a brief account of an application of the results of the
preceding section to the solution of the equation,

(8) $\phi(X) = B,$

where $X$ is a matrix of order $n$ to be determined. A different method of
solving this equation has recently been given by Ingraham.¹³

If $S$ is a non-singular matrix, then $\phi(SXS^{-1}) = S\phi(X)S^{-1} = SBS^{-1}$.
Hence there is no loss of generality in assuming that $B$ is in canonical form.
We shall assume henceforth that $B$ is in canonical form, and is therefore the
direct sum of the companion matrices of its elementary divisors. We observe
that, if $X$ is a solution of the equation (8), then $SXS^{-1}$ is also a solution,
if and only if $S$ is commutative with $B$.

We shall consider first the case in which the elementary divisors of B are 
all powers of a single irreducible polynomial f(A). Let 


be the decomposition of $f(\phi(\lambda))$ into powers of its distinct irreducible factors,
each with leading coefficient unity. It follows easily that, if $p(\lambda)$ is any
irreducible polynomial, with leading coefficient unity, then $f(x)$ is the unique
minimum polynomial (defined in § 2) such that

$f(\phi(\lambda)) \equiv 0 \pmod{p(\lambda)},$

if and only if $p(\lambda)$ is one of the $p_i(\lambda)$ occurring in (9).
Let X denote a solution of the equation (8), and Y the canonical form 


of X. That is, Y—Y, Y2 Y,, where the Y; are the companion 
matrices of the elementary divisors of X. Then 


$(Y) + $(¥2) +: 


is similar to $B$, and the elementary divisors of $B$ are precisely the elementary
divisors of all the $\phi(Y_i)$. But by the results of the preceding section, $\phi(Y_i)$
can have elementary divisors which are powers of $f(\lambda)$, if and only if $Y_i$ is
the companion matrix of some power of a $p_i(\lambda)$ occurring in (9). Suppose
then that the elementary divisors of $Y$ are


¹³ M. H. Ingraham, “On the rational solutions of the matrix equation P(X) = A,”
Journal of Mathematics and Physics, vol. 13 (1934), pp. 46-50. For additional references
to matric equations see MacDuffee, op. cit., chap. 8.




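(10) $[p_i(\lambda)]^{n_{i1}},\ [p_i(\lambda)]^{n_{i2}},\ \cdots \qquad (i = 1, 2, \cdots),$

where $n_{ij} \ge n_{il}$ if $l > j$. Let the degree of $p_i(\lambda)$ be denoted by $N_i$. Then
we must have

(11) $\sum_{i,j} N_i\, n_{ij} = n.$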

We are now in a position to give a method of finding all solutions of the
equation (8). Form the diophantine equation (11), and solve it for the $n_{ij}$
under the condition that $n_{ij} \ge n_{il}$ if $l > j$. Each solution gives us the elementary
divisors (10) of a matrix, which is a possible solution of our equation.
Form the matrix $Y$, which is the direct sum of the companion matrices of
these elementary divisors. Then by Theorem 1, it is easy to find the elementary
divisors of $\phi(Y)$. If these elementary divisors are not the same as
the elementary divisors of $B$, this $Y$ is discarded. If, however, the elementary
divisors are identical, then $\phi(Y)$ is similar to $B$, and the proof of Theorem 1
shows how to find a matrix $R$ such that $R\phi(Y)R^{-1} = B$. If we let $X = RYR^{-1}$,
then $X$ is a solution of the equation (8). If $X_1, X_2, \cdots, X_q$ is a complete set
of dissimilar solutions, all of which can be found by this method, then the
most general solutions are of the form $LX_iL^{-1}$, where $L$ is a non-singular
matrix commutative with $B$.

It is not difficult to write out additional equations, which together with
the equation (11) will serve to determine completely the admissible matrices
$Y$, but the tentative procedure outlined above is perhaps as easy to apply in
any given case.

We now return to the general case in which the elementary divisors of B 
are unrestricted. Suppose these elementary divisors are 


[f:(A)]"™, Lf (A) 


where the $f_i(\lambda)$ $(i = 1, 2, \cdots, l)$ are distinct and irreducible. We may then
write $B = B_1 \dotplus B_2 \dotplus \cdots \dotplus B_l$, where $B_i$ is the direct sum of the companion
matrices of the elementary divisors occurring in the $i$-th row of the table (12).
We shall now prove the following theorem:




THEOREM 2. If $X$ is any solution of the equation (8), then

$X = X_1 \dotplus X_2 \dotplus \cdots \dotplus X_l,$

where $X_i$ is of the same order as $B_i$, and is a solution of the equation,
$\phi(X_i) = B_i$ $(i = 1, 2, \cdots, l)$.
Let 


$(i = 1, 2, \cdots, l)$


be the decomposition of the $f_i(\phi(\lambda))$ into powers of distinct irreducible factors,
each with leading coefficient unity. If $X$ is a given solution of equation
(8), it follows by an argument similar to that used above that the elementary
divisors of $X$ are all powers of the $p_{ij}(\lambda)$ $(i = 1, 2, \cdots, l;\ j = 1, 2, \cdots, s_i)$.
Let $Y_i$ denote the direct sum of the companion matrices of the elementary
divisors of $X$ which are powers of the functions $p_{i1}(\lambda), \cdots, p_{is_i}(\lambda)$
$(i = 1, 2, \cdots, l)$, and set $Y = Y_1 \dotplus Y_2 \dotplus \cdots \dotplus Y_l$. Since the $f_i(\lambda)$ are
distinct, it follows that the $p_{ij}(\lambda)$ are all distinct, and the elementary divisors
of $\phi(Y)$ which are powers of $f_i(\lambda)$ are precisely the elementary divisors of
$\phi(Y_i)$. Hence $Y_i$ is of the same order as $B_i$, and $\phi(Y_i)$ is similar to $B_i$.
Let us set $SXS^{-1} = Y$, $T_i\phi(Y_i)T_i^{-1} = B_i$,
$T = T_1 \dotplus T_2 \dotplus \cdots \dotplus T_l$. We have then $T\phi(Y)T^{-1} = B$, from which it
follows that $\phi(TSXS^{-1}T^{-1}) = B$, and thus $TS$ is commutative with $B$. It is
then known¹⁴ that $TS$ is of the form $M_1 \dotplus M_2 \dotplus \cdots \dotplus M_l$, where $M_i$ is of
the same order as $B_i$, and is commutative with $B_i$. A calculation shows that

$X = X_1 \dotplus X_2 \dotplus \cdots \dotplus X_l, \qquad X_i = M_i^{-1}T_iY_iT_i^{-1}M_i.$

We find also that $\phi(X_i) = M_i^{-1}T_i\phi(Y_i)T_i^{-1}M_i = M_i^{-1}B_iM_i = B_i$.
The theorem is therefore established.


By means of this theorem, the solution of the general equation (8) is 
seen to reduce to the solution of a set of equations of the comparatively simple 
type, in which the elementary divisors of $B$ are all powers of a single irreducible
polynomial. Thus all solutions can be found by the method discussed
earlier in this section.


SMITH COLLEGE, 
NORTHAMPTON, MASS. 


¹⁴ See O. Schreier and B. L. van der Waerden, “Die Automorphismen der projektiven
Gruppen,” Abhandlungen aus dem Mathematischen Seminar der Hamburgischen Universität,
vol. 6 (1928), p. 308.




ON CERTAIN TYPES OF HEXAGONS.* 


By J. R. MUSSELMAN.


1. The resolvent, $V_1 = x_0 + \omega x_1 + \omega^2 x_2 + \cdots + \omega^{n-1}x_{n-1}$, where $\omega$ is a
primitive n-th root of unity, was introduced by Lagrange¹ in his memoirs
devoted to the fundamental principles of the solutions of the cubic and quartic
equations. Its entrance, however, into the field of geometry is very recent.
If we represent any point $P$ in the plane by the single complex number $p$,
and if $M_k$ $(k = 0, 1, \cdots, n-1)$ represents the $n$ vertices of a positively-ordered
polygon, then when the coordinates $m_k$ of these vertices are subject
to one and only one condition, namely that

$\sum_{k=0}^{n-1} \omega^k m_k = 0,$

we obtain a polygon which we shall call a positive n-gon of type M. R. L.
Echols² has used these polygons in giving geometric pictures of the solutions
of the cubic and quartic equations. The writer³ has pointed out recently
two different constructions in which these n-gons of type M occur.

In addition to the above studies of this particular type of n-gon, the
Lagrange resolvents (and their conjugates) have been used by L. M. Blumenthal⁴
to prove that the norm-area of a 2n-gon is unaltered by translating
either of its component n-gons. The Morleys⁵ in their recent book have shown
that under homologies the $n-1$ Lagrange resolvents for an n-gon form a
complete system of relative invariants, and have used them in considering
some special ordered n-points. In this article, the Lagrange resolvents are
used to disclose some new facts about a well-known figure, to characterize
certain interesting ordered six-points, and to prove that connected with any

* Read before the National Academy of Science, November 19, 1934.

¹ Memoirs of Berlin Academy, 1769; reprinted in Oeuvres de Lagrange (Paris,
1868), vol. 3, p. 207.

² The Roots of Circulants and Application to the Roots of Polynomials. The
University of Virginia (1928).

³ “On certain types of polygons,” The American Mathematical Monthly, vol. 40
(1933), p. 157.

⁴ “Lagrange resolvents in Euclidean geometry,” The American Journal of Mathematics,
vol. 49 (1927), p. 511.

⁵ Inversive Geometry, G. Bell & Sons, London (1933), p. 203.




six points there are two circumscribed hexagons, whose opposite sides are 
parallel and whose vertices lie on rectangular hyperbolas. 


2. If on the sides of any triangle⁶ $A_1A_2A_3$ we construct the positively
ordered equilateral triangles $A_1A_3A_{13}$, $A_2A_1A_{21}$, and $A_3A_2A_{32}$, the coordinates
of the three vertices are $A_{13}(-\omega a_1 - \omega^2 a_3)$, $A_{21}(-\omega a_2 - \omega^2 a_1)$, and
$A_{32}(-\omega a_3 - \omega^2 a_2)$ where $\omega^3 = 1$. The vector $A_{32}A_1$ is the Lagrange resolvent
$a_1 + \omega^2 a_2 + \omega a_3$ which we shall term $u_2$. Similarly, the vectors $A_{13}A_2$ and
$A_{21}A_3$ are $\omega u_2$ and $\omega^2 u_2$. Hence, we have the well-known theorem that the
Lagrange resolvents $A_{32}A_1$, $A_{13}A_2$ and $A_{21}A_3$ are equal in length and intersect
at angles of $2\pi/3$. These vectors meet at a point $f_2$ whose coordinate is

(2.1) $f_2 = g - \dfrac{u_2\bar{u}_1}{3\bar{u}_2}; \qquad g = (a_1 + a_2 + a_3)/3.$

Similarly, if we construct on the sides of the triangle $A_1A_2A_3$ the positively-ordered
equilateral triangles $A_1A_2A_{12}$, $A_2A_3A_{23}$, and $A_3A_1A_{31}$, then the vectors
$A_{23}A_1$, $A_{12}A_3$ and $A_{31}A_2$ are respectively $u_1$, $\omega u_1$ and $\omega^2 u_1$ where $u_1$ is the
Lagrange resolvent $a_1 + \omega a_2 + \omega^2 a_3$. These vectors meet at the point $f_1$ whose
coordinate is

(2.2) $f_1 = g - \dfrac{u_1\bar{u}_2}{3\bar{u}_1}.$

The points $f_1$ and $f_2$ are variously known as the Fermat points⁷ of the triangle
$A_1A_2A_3$ or as the isogenic centers.⁸
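
The statements of this section lend themselves to a quick numerical check with complex numbers. The sketch below is illustrative only: the sample triangle and all function names are arbitrary choices, the apex of a positively ordered equilateral triangle p, q, r is taken from the condition p + ωq + ω²r = 0, and the point f_2 is computed from the expression g - u_2 ū_1 / (3 ū_2), which is the form of (2.1) assumed here.

```python
import cmath

W = cmath.exp(2j * cmath.pi / 3)          # primitive cube root of unity


def apex(p, q):
    """Third vertex r of the positively ordered equilateral triangle p, q, r,
    determined by p + W*q + W**2 * r = 0."""
    return -W * p - W**2 * q


def fermat_data(a1, a2, a3):
    """Resolvents u1, u2, centroid g, and the assumed expression for f2."""
    u1 = a1 + W * a2 + W**2 * a3
    u2 = a1 + W**2 * a2 + W * a3
    g = (a1 + a2 + a3) / 3
    f2 = g - u2 * u1.conjugate() / (3 * u2.conjugate())
    return u1, u2, g, f2


def parallel(z, w, tol=1e-9):
    """True when the vectors z and w are parallel (zero cross product)."""
    return abs((z * w.conjugate()).imag) < tol


a1, a2, a3 = 0j, 4 + 0j, 1 + 3j           # an arbitrary non-equilateral triangle
u1, u2, g, f2 = fermat_data(a1, a2, a3)
a13, a21, a32 = apex(a1, a3), apex(a2, a1), apex(a3, a2)
```

The three vectors a1 - a32, a2 - a13, a3 - a21 come out as u2, ωu2, ω²u2, and f2 lies on each of the three lines carrying them.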

The area of the triangle A,,;A2:Az2 is five-halves that of A,A2A; plus 
3%s?/8, while the area of A3,A12A2; is five-halves that of A,A,A; minus 
3%s?/8 where s? = A,A,?-+ A,A,?-+ Hence, the sum of the areas 
of the triangles A,3;A2,Ag2 and As;Ai2A23 is 5 times the area of the triangle 
A,A,A;. The hexagons A12Ae1A32A13 Agi and 
A32A31:A23A32A13A0; are n-gons of type M, i.e. hexagons for which the Lagrange 
resolvent V, vanishes. Their areas are respectively (A,A2? + A;A,”)3%/4, 
+ AyA,2)3%/4 and + 4,A,2)3%/4. In terms of their five 
Lagrange resolvents these hexagons can be characterized respectively as 
V, =2V.+ oV, = Vz — = 0; Vi = 2V2 + Vag = Vs — = 9; 
V, = 2V. + w?V, = Vs — 0. 

The hexagon A2342:A31A32A12413 is worth some attention. Its area is 
twice that of the triangle A,A,A; and in terms of its resolvents we find 
V;=V,—3V,—3V.,+V,;—0. To discover the geometrical significance 
of these conditions it is essential to express the resolvents of the hexagon in 


⁷ Morley, loc. cit., p. 207.
⁸ R. A. Johnson, Modern Geometry, p. 218.




terms of the resolvents of its two component triangles. Thus, if we denote 
by u, and uw, the two Lagrange resolvents of the triangle A23A3:Ai2, and by 
w, and wu’, those for the triangle Az24:3;42: we can easily prove that the 
necessary and sufficient conditions for a positively-ordered hexagon to have 
V, = Vi— 3V, = 3V.+V;—0 are that the centroids of its component 
triangles coincide, that the vector u’, be negatively parallel to u, and half its 
length, and that the vector wu’, be negatively parallel to us and twice its length. 
If we denote by fi, fo; Fi, the Fermat points of the triangles 
Ay3A31A12, Ag2413401, respectively; and by hy, he; h’1, h’2; Hi, He 
the Hessian points of the same triangles, then the following facts can be 
verified—the three triangles have the same centroid; F coincides with f, and 
F, with f/1; 9, f:, f/1, Ae, h’2 and H, lie on a line and so do g, fo, f’s, hi, h’s 
and H,. The distances between these points can be readily read from the 
relations 
(2.3) 
g—h, =4(9g—H2); 


3. In this section, let us consider, in terms of their Lagrange resolvents, 
some special ordered six-points which possess features of interest. We shall 
first prove the theorem that 


The necessary and sufficient condition for a positively-ordered hexagon to have
$V_1 = V_2 = 0$ is that the sides of the triangle $x_4x_6x_2$ form positive right angles
with the corresponding medians of the triangle $x_1x_3x_5$ and equal $2 \cdot 3^{-1/2}$ times
their length.


Since V, — V. = 0, we have 


whence by addition, 


32, — 3g = — 22). 39 = 2,+2,;4+ 2; 
(3.1) or 


Similarly, we can show that 


(3. 2) — Lz 3%i (2, — g) 
and Le — = 3%i (2, — 


which demonstrates the theorem. The conditions can easily be shown to be
sufficient. Now if $g'$ be the centroid of the triangle $x_4x_6x_2$ one can show that




— = 3%1 — 9’) 
(3. 3) — = (4, — 9’) 

Ly, — = 31 — 9’) 
so that the sides of the triangle $x_1x_3x_5$ are perpendicular to the corresponding
medians of the triangle $x_4x_6x_2$ and equal to $2 \cdot 3^{-1/2}$ times their length. Thus,
we have a mutual relationship between the two triangles.⁹ In addition, since


Le Lo 3 Ly Zs g 


we see that the area of the triangle $x_4x_6x_2$ is equivalent to that of $x_1x_3x_5$. Also
—W’, 

whence the vector $f_2 - g$ is equal and positively parallel to $f'_2 - g'$, but $f_1 - g$
is equal and negatively parallel to $f'_1 - g'$; etc.

From the formulae (3.1), (3.2), and (3.3), we read that if perpendiculars,
dropped from the vertices $x_4$, $x_6$, $x_2$ of a triangle to the corresponding
sides of the triangle $x_1x_3x_5$, should meet at the centroid of $x_4x_6x_2$, then the
perpendiculars from the vertices $x_1$, $x_3$, $x_5$ to the sides of $x_4x_6x_2$ will meet at
the centroid of $x_1x_3x_5$.

The necessary and sufficient condition for a positively-ordered six-point
to have $V_4 = V_5 = 0$ is that the sides of the triangle $x_4x_6x_2$ form negative
right angles with the corresponding medians of the triangle $x_1x_3x_5$ and equal
$2 \cdot 3^{-1/2}$ times their length. This relationship is mutual and again both
triangles are equivalent in area. If in addition $V_3 = 0$, both centroids coincide,
and both Fermat points $f_1$ and $f'_1$; hence the diagonals of the hexagon
meet at angles of $2\pi/3$.


The necessary and sufficient condition for a positively-ordered six-point
to have $V_2 = V_4 = 0$ is that the midpoint of each of its diagonals should be
the midpoint of the centroids of the triangles $x_1x_3x_5$ and $x_4x_6x_2$. The two
triangles have their corresponding sides equal and parallel; they are inversely
equivalent in area and also perspective. The opposite sides of the hexagon
are equal and negatively parallel. Also

h, —g =g—h;; h, —g =g—h’s. 


⁹ If $V_1 = V_2 = V_3 = 0$, we have the special case of the above, in which the centroids
$g$ and $g'$ coincide. See Morley, loc. cit., p. 214 for details.




Hence, the join of $f'_1$ and $f'_2$ is parallel to that of $f_1$ and $f_2$; also the join of $h'_1$ and
$h'_2$ to that of $h_1$ and $h_2$. If in addition $V_3 = 0$, we can construct the component
triangles as follows: starting with the triangle $x_4x_6x_2$ with centroid $g'$,
then $x_1$ lies on the median from $x_4$ such that $x_4x_1 = 2x_4g'$; similarly for
$x_3$ and $x_5$.

The necessary and sufficient condition for a positively ordered six-point
to have $V_1 = V_5 = 0$ is that each diagonal shall be parallel and equal in length
to the vector joining the centroids $g$ and $g'$ of the two component triangles.
These two triangles have their corresponding sides equal and parallel and are
directly congruent. If in addition, we make $V_3 = 0$, then the two triangles
will coincide throughout, and the hexagon is a doubly-counted triangle.
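
Two of the characterizations of this section can be illustrated numerically. The sketch below is not from the paper: it assumes the indexing V_k = x_1 + ε^k x_2 + ε^{2k} x_3 + ... + ε^{5k} x_6 with ε = exp(πi/3), and the sample triangle, the translation d and the centre c are arbitrary. It builds a hexagon whose second component triangle is a translate of the first, and one whose diagonals share a common midpoint, and evaluates the resolvents.

```python
import cmath

EPS = cmath.exp(1j * cmath.pi / 3)        # primitive sixth root of unity


def resolvents(x):
    """V_1..V_5 of the hexagon x = [x1, ..., x6] under the assumed indexing."""
    return {k: sum(EPS ** (k * j) * x[j] for j in range(6)) for k in range(1, 6)}


x1, x3, x5 = 0j, 4 + 0j, 1 + 3j           # first component triangle

# Second triangle a translate of the first: x4 = x1 + d, x6 = x3 + d, x2 = x5 + d,
# so every diagonal equals d, the vector joining the two centroids.
d = 2 - 1j
translated = [x1, x5 + d, x3, x1 + d, x5, x3 + d]
V_translated = resolvents(translated)

# Every diagonal bisected by the same point c: x4 = 2c - x1, and so on.
c = 1 + 1j
central = [x1, 2 * c - x5, x3, 2 * c - x1, x5, 2 * c - x3]
V_central = resolvents(central)
```

Under these assumptions the translated hexagon has V_1 = V_5 = 0 with V_3 = -3d, and the centrally symmetric one has V_2 = V_4 = 0.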


4. Let A, (k =1,2,--+-,6) be any positively-ordered hexagon and 

let us construct the following six positive hexagons for which the Lagrange 

“9 resolvent V, vanishes, 
P’sAgA1A2A3A4 and P’gA,A2A;A4A5. The coordinates of the 
points P’; (t= 1,2,---+,6) are respectively a, + do, a3 —w?V3, 


Vita, ad; —wV, and w?V,-+ a. The equations of the six lines A;P’; are 
the 
at Vit Via Vid, — 0 


Via —wV,z wV 1 0 


Vik + Vids — QV, = 0 
V2 + w? Vids — = 0 
oth 
From the form of these equations, they represent three pairs of parallel lines;
also each line makes a positively-directed angle of $2\pi/3$ with the consecutive
line. The coordinates of the point of intersection of each line with the consecutive
line are
be : + Vi (a, — w*a,) (1 — 0?) V, 
two — 2) + Vi (a2 — w*a,) 
sely Ps: wV;(d2—ds) + Vi(as — + 
gon P,: Vi (Gs + Vi (a4 — w?a3) 
Ps: (ds —Gs) + Vi (as — w*a4) 
Pe: wV1(ds + Vi (de — 
If we call the join of the lines A,P’, and A;P”, by Bs; of AsP’; and A,P’s 
by Bi; of A,P’, and A,P’, by Bs; of and A2P’2 by By; of AP’, and 


by Be; of A,P’, and by B, then one can show that B,B;B, and 




$B_2B_4B_6$ are equal positive equilateral triangles with corresponding sides positively
parallel. Now the necessary and sufficient condition that the six points
of intersection of two parallel equilateral triangles (sides produced if necessary)
lie on a rectangular hyperbola¹⁰ is that the sides of the two triangles be
equal. Consequently, the six points of intersection of the triangles $B_1B_3B_5$
and $B_2B_4B_6$, which are the six points $P_i$ $(i = 1, 2, \cdots, 6)$, lie on a rectangular
hyperbola. In terms of the Lagrange resolvents this six-point is characterized
by Vi = = V;V; —4V3V3 =0. Hence, the sides of the triangle P,P,P, 
make positive right angles with the corresponding medians of P,P;P; and are 
equal to 2.3-% times their length; also the Lagrange resolvent uw. of the 
triangle P,P;P; is three times the length of the join of the two centroids. 

Again, if A; be any positively-ordered hexagon and we construct the six 
positive hexagons for which the resolvent V; vanishes, 
we will obtain by a process similar to the above-mentioned one a six-point P; 
for which V,= V; = ViV,—4V;V;—=0. Hence, the sides of the triangle 
P,P,P. form negative right angles with the corresponding medians of P,P;P; 
and are equal to 2.3-% times their length, also the Lagrange resolvent w, of 
the triangle P,;P;P; is three times the length of the join of the two centroids. 
The points B,B,B; and B,B,B, are equal positive equilateral triangles with 
corresponding sides positively parallel and therefore the vertices of this hexagon 
lie on a rectangular hyperbola. Hence, associated with any positively-ordered 
hexagon are two circumscribed six-points, whose opposite sides are parallel and 
whose vertices lie on rectangular hyperbolas. 

However, if we construct the six positive hexagons P’;A.A3A4A5Ao,° °° for 
which the resolvent V. vanishes, we will obtain a six-point P; for which 
The points B,B;B; and B,B.B, are equal 
positive equilateral triangles with corresponding sides negatively parallel. 
Similarly, if we construct the six positive hexagons for which the resolvent V, 
vanishes, we will obtain a six-point for which V, = Vs = V;V;— 4VsVs 
=0. The points B,B,B, and B,B,B, are equal positive equilateral triangles 
with corresponding sides negatively parallel. Hence, associated with any 
positively-ordered hexagon are two circumscribed six-points whose opposite 
sides are equal and parallel, whose diagonals pass through a point, and whose 
vertices lie on conics. 


WESTERN RESERVE UNIVERSITY. 


¹⁰ J. R. Musselman, The American Mathematical Monthly, vol. 41 (1934), p. 634.


ON THE ABSTRACT PROPERTIES OF LINEAR DEPENDENCE.* 


By HASSLER WHITNEY.


1. Introduction. Let $C_1, C_2, \cdots, C_n$ be the columns of a matrix $M$.
Any subset of these columns is either linearly independent or linearly dependent;
the subsets thus fall into two classes. These classes are not arbitrary;
for instance, the two following theorems must hold:


(a) Any subset of an independent set is independent.


(b) If $N_p$ and $N_{p+1}$ are independent sets of $p$ and $p+1$ columns respectively,
then $N_p$ together with some column of $N_{p+1}$ forms an independent set
of $p+1$ columns.


There are other theorems not deducible from these; for in § 16 we give
an example of a system satisfying these two theorems but not representing any
matrix. Further theorems seem, however, to be quite difficult to find. Let
us call a system obeying (a) and (b) a “matroid.” The present paper is
devoted to a study of the elementary properties of matroids. The fundamental
question of completely characterizing systems which represent matrices is left
unsolved. In place of the columns of a matrix we may equally well consider
points or vectors in a Euclidean space, or polynomials, etc.

This paper has a close connection with a paper by the author on linear
graphs;¹ we say a subgraph of a graph is independent if it contains no circuit.
Although graphs are, abstractly, a very small subclass of the class of matroids, 
(see the appendix), many of the simpler theorems on graphs, especially on 
non-separable and dual graphs, apply also to matroids. For this reason, we 
carry over various terms in the theory of graphs to the present theory. 
Remarkably enough, for matroids representing matrices, dual matroids have 
a simple geometrical interpretation quite different from that in the case of 
graphs (see § 13). 

The contents of the paper are as follows: In Part I, definitions of 
matroids in terms of the concepts rank, independence, bases, and circuits are 
considered, and their equivalence shown. Some common theorems are deduced 
(for instance Theorem 8). Non-separable and dual matroids are studied in 


* Presented to the American Mathematical Society, September, 1934.
¹ “Non-separable and planar graphs,” Transactions of the American Mathematical
Society, vol. 34 (1932), pp. 339-362. We refer to this paper as G.




Part II; this section might replace much of the author’s paper G. The subject
of Part III is the relation between matroids and matrices. In the appendix,
we completely solve the problem of characterizing matrices of integers modulo 2,
of interest in topology.

I. MATROIDS.

2. Definitions in terms of rank. Let a set $M$ of elements $e_1, e_2, \cdots, e_n$
be given. Corresponding to each subset $N$ of these elements let there be a
number $r(N)$, the rank of $N$. If the three following postulates are satisfied,
we shall call this system a matroid.


(R_1) The rank of the null subset is zero.
(R_2) For any subset $N$ and any element $e$ not in $N$,

$r(N + e) = r(N) + k, \qquad (k = 0 \text{ or } 1).$

(R_3) For any subset $N$ and elements $e_1$, $e_2$ not in $N$, if $r(N + e_1)$
$= r(N + e_2) = r(N)$, then $r(N + e_1 + e_2) = r(N)$.

Evidently any subset of a matroid is a matroid. In what follows, M is a 
fixed matroid. We make the following definitions: 


$\rho(N)$ = number of elements in $N$.
$n(N) = \rho(N) - r(N)$ = nullity of $N$.


$N$ is independent, or, the elements of $N$ are independent, if $n(N) = 0$;
otherwise, $N$, and its set of elements, are dependent.


LEMMA 1. For any $N$, $r(N) \ge 0$ and $n(N) \ge 0$. If $N \subset M$, then
$r(N) \le r(M)$, $n(N) \le n(M)$.
LEMMA 2. Any subset of an independent set is independent.


$e$ is dependent on $N$ if $r(N + e) = r(N)$; otherwise $e$ is independent of $N$.


A base is a maximal independent submatroid of $M$, i.e. a matroid $B$ in
$M$ such that $n(B) = 0$, while $B \subset N$, $B \ne N$ implies $n(N) > 0$. See also
Theorem 7. A base complement $A = M - B$ is the complement in $M$ of a
base $B$. A circuit is a minimal dependent matroid, i.e. a matroid $P$ such that
$n(P) > 0$, while $N \subset P$, $N \ne P$ implies $n(N) = 0$.²


THEOREM 1. $N$ is independent if and only if it is contained in a base,
or, if and only if it contains no circuit.


² Compare G, Theorem 9.




THEOREM 2. A circuit is a minimal submatroid contained in no base, 
i.e. containing at least one element from each base complement. A base is a 
maximal submatroid containing no circuit. A base complement is a minimal 
submatroid containing at least one element from each circuit. 


The above facts follow at once from the definitions. Note the reciprocal 
relationship between circuits and base complements. Note also that the 
definitions of independence and of being a circuit depend only on the given 
subset, while the property of being a base depends on the relationship of the 
subset to M. 


3. Properties of rank. Our object here is to prove Theorem 3. The
following definition will be useful:

(3.1) $\Delta(M, N) = r(M + N) - r(M).$

LEMMA 3. $\Delta(M + e_2, e_1) \le \Delta(M, e_1)$.

Suppose first $r(M + e_1) = r(M) + 1$; then $r(M + e_1 + e_2) = r(M) + k$,
$k = 1$ or 2. If $k = 2$, then $r(M + e_2) = r(M) + 1$, on account of (R_2), and
the inequality holds; if $k = 1$, $r(M + e_2) = r(M) + l$, $l = 0$ or 1, and it
holds again. If $r(M + e_2) = r(M) + 1$, the same reasoning applies. If
finally $r(M + e_1) = r(M + e_2) = r(M)$, the inequality follows from (R_3).


LEMMA 4. $\Delta(M + N, e) \le \Delta(M, e)$.

If $N = e_1 + \cdots + e_p$, the last lemma gives

$\Delta(M + e_1 + \cdots + e_p, e) \le \Delta(M + e_1 + \cdots + e_{p-1}, e) \le \cdots \le \Delta(M, e).$

THEOREM 3. $\Delta(M + N_2, N_1) \le \Delta(M, N_1)$, or,

(3.2) $r(M + N_1 + N_2) \le r(M + N_1) + r(M + N_2) - r(M).$

This is true if $N_1$ contains but a single element. For the general case,
we apply the last lemma and induction, setting $N_1 = N' + e$:

$\Delta(M + N_2, N_1) = \Delta(M + N_2 + e, N') + \Delta(M + N_2, e)$
$\le \Delta(M + e, N') + \Delta(M, e) = \Delta(M, N_1).$

(3.2) is evidently equivalent to:

(3.3) $r(M_1 + M_2) \le r(M_1) + r(M_2) - r(M_1 \cdot M_2).$
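
The inequality of Theorem 3 can be tested exhaustively on a small example. The sketch below is illustrative and rests on assumptions not made in this section: it takes (3.3) in the form r(M_1 + M_2) <= r(M_1) + r(M_2) - r(M_1 · M_2), with the dot denoting the common part, and it uses the edges of the complete graph on four vertices, where the rank of an edge set is the number of edges in a largest circuit-free subset (computed here with a union-find structure).

```python
from itertools import combinations


def graph_rank(edges):
    """Size of a largest circuit-free subset of the given edges."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    r = 0
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:                        # edge joins two components
            parent[ra] = rb
            r += 1
    return r


K4 = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
SUBSETS = [frozenset(S) for k in range(len(K4) + 1) for S in combinations(K4, k)]
RANK = {S: graph_rank(sorted(S)) for S in SUBSETS}

submodular = all(RANK[M1 | M2] <= RANK[M1] + RANK[M2] - RANK[M1 & M2]
                 for M1 in SUBSETS for M2 in SUBSETS)
```

The check runs over all 4096 ordered pairs of edge sets.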


4. Deduction of (I_1), (I_2) from (R_1), (R_2), (R_3). The first postulate
on independent sets below obviously holds if (R_1) and (R_2) hold. To prove
(I_2), take $N$, $N'$ as given there; then

$r(N) = p, \qquad r(N') = p + 1.$

We must show that for some $i$, $\Delta(N, e'_i) = 1$. (Then $e'_i$ does not lie in $N$.)
If this is not so, then on using Lemma 4 we find

$1 = r(N') - r(N) \le \Delta(N, N')$
$= \Delta(N, e'_1) + \Delta(N + e'_1, e'_2) + \cdots + \Delta(N + e'_1 + \cdots + e'_p, e'_{p+1})$
$\le \Delta(N, e'_1) + \Delta(N, e'_2) + \cdots + \Delta(N, e'_{p+1}) = 0,$

a contradiction.


5. Deduction of (C_1), (C_2) from (R_1), (R_2), (R_3). We shall need
here a theorem showing how the nullity (or rank) of a matroid may be determined
when we know what circuits it contains.


LEMMA 5. Each element of a circuit is dependent on the rest of the circuit.

If $e$ is an element of the circuit $P$, then $n(P) = 1$, $n(P - e) = 0$;
hence $r(P) = \rho(P) - 1 = \rho(P - e) = r(P - e)$.


LEMMA 6. If $e$ is dependent on $P_1$, but on no proper subset of $P_1$, then
$P = P_1 + e$ is a circuit.

As $\Delta(P_1, e) = 0$, $r(P) = r(P_1) \le \rho(P_1) < \rho(P)$, $n(P) > 0$, and $P$
contains a circuit $P'$. If $P'$ does not contain $e$, take $e'$ in $P'$; then

$\Delta(P_1 - e', e') \le \Delta(P' - e', e') = 0,$

hence $r(P_1 - e') = r(P_1)$, and

$\Delta(P_1 - e', e) = r(P_1 - e' + e) - r(P_1 - e')$
$\le r(P_1 + e) - r(P_1) = \Delta(P_1, e) = 0,$

and $e$ is dependent on the proper subset $P_1 - e'$ of $P_1$, a contradiction. Therefore
$P'$ contains $e$. As $P'$ is a circuit, $e$ is dependent on the rest of $P'$; hence
$P' = P$.


THEOREM 4. If $e$ is not in $N$, there is a circuit in $N + e$ which contains
$e$ if and only if $e$ is dependent on $N$.

Suppose $P_1 + e = P$ is a circuit, $P_1 \subset N$. Then

$\Delta(N, e) \le \Delta(P_1, e) = 0,$

and $e$ is dependent on $N$. Suppose, conversely, $\Delta(N, e) = 0$. Let $P_1$ be a
smallest subset of $N$ on which $e$ is dependent; then by the last lemma,
$P = P_1 + e$ is a circuit. (It may be that $P = e$.)


THEOREM 5. If $N$ is formed element by element, then $n(N)$ is just the
number of times that adding an element increases the number of circuits
present.


Say $N = e_1 + \cdots + e_p$. Then if $O$ is the null set,

$r(N) = \Delta(O, e_1) + \Delta(e_1, e_2) + \cdots + \Delta(e_1 + \cdots + e_{p-1}, e_p).$

Each $\Delta(e_1 + \cdots + e_{i-1}, e_i) = 0$ or 1, and $= 0$ if and only if $e_i$ is dependent
on $e_1 + \cdots + e_{i-1}$, i.e. if and only if there is a circuit in $e_1 + \cdots + e_i$
containing $e_i$. The number of terms is $p = \rho(N)$, and the theorem follows.
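
Theorem 5 can be watched at work on a small graph. The sketch below is an illustration only, with hypothetical names: the elements are the six edges of the complete graph on four vertices, rank is the size of a largest circuit-free subset, circuits are found by brute force as the minimal dependent subsets, and the edges are added one at a time in an arbitrary order.

```python
from itertools import combinations


def graph_rank(edges):
    """Size of a largest circuit-free subset of the given edges."""
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    r = 0
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            r += 1
    return r


def is_circuit(S):
    """S is dependent, but every subset obtained by dropping one element is independent."""
    S = list(S)
    return (graph_rank(S) < len(S)
            and all(graph_rank(S[:i] + S[i + 1:]) == len(S) - 1 for i in range(len(S))))


def count_circuits(edges):
    return sum(is_circuit(S)
               for k in range(1, len(edges) + 1)
               for S in combinations(edges, k))


ORDER = [(1, 2), (2, 3), (1, 3), (3, 4), (1, 4), (2, 4)]
steps = sum(count_circuits(ORDER[:i]) > count_circuits(ORDER[:i - 1])
            for i in range(1, len(ORDER) + 1))
nullity = len(ORDER) - graph_rank(ORDER)
```

Here the third, fifth and sixth edges each create new circuits, and the count agrees with the nullity 6 - 3 = 3.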

We turn now to the proof of (C_1) and (C_2). The first is obvious. To
prove the second, take $P_1$, $P_2$, $e_1$, $e_2$ as given. As

$\Delta(P_1 - e_2, e_2) = \Delta(P_2 - e_1, e_1) = 0,$

we have

$\Delta(P_1 + P_2 - e_2, e_2) = \Delta(P_1 + P_2 - e_1 - e_2, e_1) = 0.$

These equations give

$r(P_1 + P_2 - e_1 - e_2) = r(P_1 + P_2 - e_2) = r(P_1 + P_2).$

Using (R_2) gives

$r(P_1 + P_2 - e_1) = r(P_1 + P_2 - e_1 - e_2);$

hence the required circuit $P_3$ exists, by Theorem 4.


6. Postulates for independent sets. Let $M$ be a set of elements. Let any subset $N$ of $M$ be either "independent" or "dependent." Let the two following postulates be satisfied:

$(I_1)$ Any subset of an independent set is independent.

$(I_2)$ If $N = e_1 + \cdots + e_p$ and $N' = e'_1 + \cdots + e'_{p+1}$ are independent, then for some $i$ such that $e'_i$ is not in $N$, $N + e'_i$ is independent.




514 HASSLER WHITNEY. 


The resulting system is equivalent to a matroid, as we now show. Given any subset $N$ of $M$, we let $r(N)$ be the number of elements in a largest independent subset of $N$. Obviously Postulates $(R_1)$ and $(R_2)$ are satisfied; we must prove $(R_3)$. Say
$$r(N + e_1) = r(N + e_2) = r(N) = r.$$
Then $r(N + e_1 + e_2) = r$ or $r + 1$. If it equals $r + 1$, there is an independent set $N' = e'_1 + \cdots + e'_{r+1}$ in $N + e_1 + e_2$. Let $N'' = e''_1 + \cdots + e''_r$ be an independent set in $N$. By $(I_2)$ there is an $i$ such that $N'' + e'_i$ is an independent set of $r + 1$ elements. But $N'' + e'_i$ lies in $N + e_1$ or in $N + e_2$, and hence $r(N + e_1)$ or $r(N + e_2) = r + 1$, a contradiction. Therefore $r(N + e_1 + e_2) = r$, as required.

We have shown how to deduce either set of postulates (R) or (I) from the other. Moreover the definitions of the rank and the independence or dependence of any subset of $M$ agree under the two systems, and hence they are equivalent.
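The passage from the postulates (I) to a rank function can be sketched by brute force on a small example. This is an illustration only, under stated assumptions: the function names are hypothetical, and the example family (all subsets of {1, 2, 3} with at most two elements) is chosen here and does not appear in the paper.

```python
from itertools import combinations

def rank(subset, independent):
    """Number of elements in a largest independent subset of `subset`."""
    items = sorted(subset)
    for k in range(len(items), -1, -1):
        if any(frozenset(c) in independent for c in combinations(items, k)):
            return k
    raise ValueError("the null set must be independent")

def satisfies_I2(independent):
    """Postulate (I2): an independent set with one more element can extend a smaller one."""
    for a in independent:
        for b in independent:
            if len(b) == len(a) + 1 and not any((a | {e}) in independent for e in b - a):
                return False
    return True

def satisfies_R3(ground, independent):
    """Postulate (R3): r(N+e1) = r(N+e2) = r(N) implies r(N+e1+e2) = r(N)."""
    elems = sorted(ground)
    for k in range(len(elems) + 1):
        for chosen in combinations(elems, k):
            n = set(chosen)
            r = rank(n, independent)
            outside = [e for e in elems if e not in n]
            for e1, e2 in combinations(outside, 2):
                if rank(n | {e1}, independent) == r == rank(n | {e2}, independent):
                    if rank(n | {e1, e2}, independent) != r:
                        return False
    return True

# Assumed example: every subset of {1, 2, 3} with at most two elements is independent.
GROUND = {1, 2, 3}
INDEPENDENT = {frozenset(c) for k in range(3) for c in combinations(sorted(GROUND), k)}
```

A family such as {null set, 1, 2, 3, 12} satisfies $(I_1)$ but fails `satisfies_I2`, since neither 13 nor 23 is independent.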


7. Postulates for bases. Let $M$ be a set of elements, and let each subset either be or not be a "base." We assume

$(B_1)$ No proper subset of a base is a base.

$(B_2)$ If $B$ and $B'$ are bases and $e$ is an element of $B$, then for some element $e'$ in $B'$, $B - e + e'$ is a base.

We shall prove the equivalence of this system with the preceding one.

We write here $e_1 e_2 \cdots$ instead of $e_1 + e_2 + \cdots$ for short.


THEOREM 6. All bases contain the same number of elements.

For suppose
$$B = e_1 \cdots e_p e_{p+1} \cdots e_q e_{q+1} \cdots e_r, \qquad B' = e_1 \cdots e_p e'_{p+1} \cdots e'_q$$
are bases, with exactly $e_1, \dots, e_p$ in common, and $r > q$. We might have $p = 0$. $q > p$, on account of $(B_1)$. By $(B_2)$, we can replace $e_{p+1}$ in $B$ by an element $e'$ of $B'$, giving a base $B_1$. $e' = e'_{i_1}$ is one of the elements $e'_{p+1}, \dots, e'_q$, for otherwise $B_1$ would be a proper subset of $B$. Hence
$$B_1 = e_1 \cdots e_p e'_{i_1} e_{p+2} \cdots e_r.$$
If $q > p + 1$, we replace $e_{p+2}$ in $B_1$ by an element $e'_{i_2}$ of $B'$, giving a base $B_2$. Continuing in this manner, we obtain finally the base
$$B_{q-p} = e_1 \cdots e_p e'_{i_1} \cdots e'_{i_{q-p}} e_{q+1} \cdots e_r.$$
But this contains $B'$ as a proper subset, contradicting $(B_1)$.

We shall say a subset of $M$ is independent if it is contained in a base. $(I_1)$ obviously holds; we shall prove $(I_2)$. Let $N$, $N'$ be independent sets in the bases $B$, $B'$. Say
$$B = e_1 \cdots e_p e_{p+1} \cdots e_q e_{q+1} \cdots e_r e_{r+1} \cdots e_s,$$
$$B' = e_1 \cdots e_p e'_{p+1} \cdots e'_q e'_{q+1} \cdots e'_r e_{r+1} \cdots e_s,$$
$$N = e_1 \cdots e_p \cdots e_q, \qquad N' = e_1 \cdots e_p e'_{p+1} \cdots e'_{q+1}.$$
Then $N$ and $N'$ have just $e_1, \dots, e_p$ in common, and $B$ and $B'$ have just these elements and $e_{r+1}, \dots, e_s$ in common. By $(B_2)$, there is an element $e'_{i_1}$ of $B'$ such that
$$B_1 = B - e_{q+1} + e'_{i_1}$$
is a base. (This element cannot be any of $e_1, \dots, e_p, e_{r+1}, \dots, e_s$, by $(B_1)$.) If $i_1$ is one of the numbers $p+1, p+2, \dots, q+1$, then $N + e'_{i_1}$ is in a base $B_1$, as required. Suppose not; then there is a base
$$B_2 = B_1 - e_{q+2} + e'_{i_2}$$
with $i_2 \ne i_1$. If $p + 1 \le i_2 \le q + 1$, $N + e'_{i_2}$ is in a base $B_2$. If not, we find a base $B_3$, etc. We can drop out each of the $r - q$ elements $e_{q+1}, \dots, e_r$ in turn; as there are only $r - q - 1$ elements $e'_i$ with $i > q + 1$, we find at some point a base containing $e_1, \dots, e_q, e'_j$ with $p + 1 \le j \le q + 1$. Then $e'_j$ is in $N'$, and $N + e'_j$ is in a base and is thus independent, as required.

The definitions of base and independent sets in the two systems (I) and (B) are easily seen to agree. Suppose $(I_1)$ and $(I_2)$ hold. $(B_1)$ obviously holds; using $(I_2)$, we prove that all bases contain the same number of elements; $(B_2)$ now follows at once from $(I_2)$. Hence the two systems are equivalent.


THEOREM 7. $B$ is a base in $M$ if and only if
$$r(B) = r(M), \qquad n(B) = 0.$$

Evidently, $B$ is a base under the given conditions. To prove the converse, we note first that there exists a base with $r(M)$ elements, as $r(M)$ is the maximum number of independent elements in $M$ (see § 6). By Theorem 6, all bases have this many elements, and the equations follow.

THEOREM 8. If $B$ is a base and $N$ is independent, then for some $N'$ in $B$, $N + N'$ is a base.




This follows from repeated application of Postulate $(I_2)$ and the last theorem.


8. Postulates for circuits. Let $M$ be a set of elements, and let each subset either be or not be a "circuit." We assume:

$(C_1)$ No proper subset of a circuit is a circuit.

$(C_2)$ If $P_1$ and $P_2$ are circuits, $e_1$ is in both $P_1$ and $P_2$, and $e_2$ is in $P_1$ but not in $P_2$, then there is a circuit $P_3$ in $P_1 + P_2$ containing $e_2$ but not $e_1$.

$(C_2)$ may be phrased as follows: If the circuits $P_1$ and $P_2$ have the common element $e$, then $P_1 + P_2 - e$ is the union of a set of circuits.

We shall define the rank of any subset of $M$, and shall then show that the postulates for rank are satisfied. Let $e_1, \dots, e_p$ be any ordered set of elements of $M$. Set $r_i = 0$ if there is a circuit in $e_1 + \cdots + e_i$ containing $e_i$, and set $r_i = 1$ otherwise (compare Theorem 5). Let the "rank" of the ordered set be
$$r(e_1, \dots, e_p) = \sum_{i=1}^{p} r_i.$$

LEMMA 7. The rank of an ordered set $e_1, \dots, e_q$ is unchanged if its last two elements are interchanged.

To prove this, let $N$ be the ordered set $e_1, \dots, e_{q-2}$, and set
$$r(N) = r, \quad r(N, e_{q-1}) = r_1, \quad r(N, e_q) = r_2, \quad r(N, e_{q-1}, e_q) = r_{12}, \quad r(N, e_q, e_{q-1}) = r_{21}.$$

CASE 1. There is no circuit in $N + e_{q-1}$ containing $e_{q-1}$, and none in $N + e_q$ containing $e_q$. Then
$$r_1 = r_2 = r + 1.$$
If there is a circuit in $N + e_{q-1} + e_q$ containing $e_{q-1}$ and $e_q$, then
$$r_{12} = r_{21} = r + 1;$$
otherwise,
$$r_{12} = r_{21} = r + 2.$$

CASE 2. There is a circuit $P_1$ in $N + e_{q-1}$ containing $e_{q-1}$, and a circuit $P_2$ in $N + e_{q-1} + e_q$ containing $e_{q-1}$ and $e_q$. Then, by $(C_2)$, there is a circuit $P_3$ in $N + e_q$ containing $e_q$. Hence
$$r_1 = r_2 = r, \qquad r_{12} = r_{21} = r.$$

CASE 3. There is a circuit $P_1$ as above, but no circuit $P_2$ as above. If there is a circuit $P_3$ as above, the last set of equations hold. Otherwise,
$$r_1 = r, \quad r_2 = r + 1, \qquad r_{12} = r_{21} = r + 1.$$

CASE 4. There is a circuit in $N + e_q$ containing $e_q$. This case overlaps the two preceding ones; the proof above applies here also.


LEMMA 8. The rank of any subset $N$ is independent of the ordering of the elements of $N$.

We saw above that interchanging the last two elements of any subset does not alter the rank; hence, evidently, interchanging any two adjacent elements leaves the rank unchanged. Any ordering of $N$ may be obtained from any other by a number of interchanges of adjacent elements; the rank remains unchanged at each step, proving the lemma.

Postulates $(R_1)$ and $(R_2)$ are obviously satisfied. To prove $(R_3)$, suppose $r(N + e_1) = r(N + e_2) = r(N)$. Then there is a circuit in $N + e_1$ containing $e_1$ and one in $N + e_2$ containing $e_2$; hence $r(N + e_1 + e_2) = r(N)$.

The definitions of rank and of circuits under the two systems (R), (C) agree, and hence the systems are equivalent.


9. Fundamental sets of circuits. The circuits $P_1, \dots, P_q$ of a matroid $M$ form a fundamental set of circuits if $q = n(M)$ and the elements $e_1, \dots, e_n$ of $M$ can be ordered so that $P_i$ contains $e_{n-q+i}$ but no $e_{n-q+j}$ $(j > i)$. The set is strict if $P_i$ contains $e_{n-q+i}$ but no $e_{n-q+j}$ $(0 < j < i$ or $j > i)$. These sets may be called sets with respect to $e_{n-q+1}, \dots, e_n$.

THEOREM 9. If $B = e_1 + \cdots + e_{n-q}$ is a base in $M = e_1 + \cdots + e_n$, then there is a strict fundamental set of circuits with respect to $e_{n-q+1}, \dots, e_n$; these circuits are uniquely determined.

As $r(B) = r(M)$, $\Delta(B, e_i) = 0$ $(i = n-q+1, \dots, n)$. Hence, by Theorem 4, there is a circuit $P_i$ containing $e_i$ and elements (possibly) of $B$. $P_{n-q+1}, \dots, P_n$ is the required set. Suppose, for a given $i$, there were also a circuit $P'_i \ne P_i$. Then Postulate $(C_2)$ applied to $P_i$ and $P'_i$ would give us a circuit $P$ in $B$, which is impossible.

This theorem corresponds to the theorem that if a square submatrix N 
of a matrix M is non-singular, then N can be turned into the unit matrix 
by a linear transformation on the rows of M. 


THEOREM 10. If $P_1, \dots, P_q$ form a fundamental set of circuits with respect to $e_{n-q+1}, \dots, e_n$, then there is a unique strict set $P'_1, \dots, P'_q$ with respect to $e_{n-q+1}, \dots, e_n$.

Set $B = M - (e_{n-q+1} + \cdots + e_n)$. The existence of $P_1, \dots, P_q$ shows that $r(M) = r(M - e_n) = \cdots = r(B)$. Hence $\rho(B) = n - q = r(M) = r(B)$, and $B$ is a base, by Theorem 7. Theorem 9 now applies.

Note that a matroid is not uniquely determined by a fundamental set of circuits (but see the appendix). This is shown by the following two matroids, in each of which the first two circuits form a strict fundamental set:


M, with circuits 1234, 1256, 3456; 
M’, with circuits 1234, 1256, 13456, 23456. 
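The two circuit lists above can be checked mechanically. The sketch below is an illustration with hypothetical function names: it tests Postulates $(C_1)$ and $(C_2)$ by brute force and reads off the circuits determined by the base 1235, which is chosen here for the example (any four-element set containing no circuit would serve).

```python
def satisfies_C1(circuits):
    """(C1): no circuit is a proper subset of another circuit."""
    return not any(a < b for a in circuits for b in circuits)

def satisfies_C2(circuits):
    """(C2): for e1 in both P1 and P2 and e2 in P1 only, some circuit
    inside P1 + P2 contains e2 but not e1."""
    for p1 in circuits:
        for p2 in circuits:
            union = p1 | p2
            for e1 in p1 & p2:
                for e2 in p1 - p2:
                    if not any(e2 in p3 and e1 not in p3 and p3 <= union
                               for p3 in circuits):
                        return False
    return True

def fundamental_circuits(base, ground, circuits):
    """For each element e outside the base, the circuits in base + e containing e."""
    return {e: [c for c in circuits if e in c and c <= base | {e}]
            for e in sorted(ground - base)}

GROUND = frozenset(range(1, 7))
M = [frozenset(s) for s in ({1, 2, 3, 4}, {1, 2, 5, 6}, {3, 4, 5, 6})]
M_PRIME = [frozenset(s) for s in ({1, 2, 3, 4}, {1, 2, 5, 6},
                                  {1, 3, 4, 5, 6}, {2, 3, 4, 5, 6})]
BASE = frozenset({1, 2, 3, 5})  # assumed base, common to both matroids
```

Both lists pass the two checks, and with respect to the elements 4 and 6 both yield the same strict fundamental set 1234, 1256, although the full circuit lists differ.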


II. SEPARABILITY, DUAL MATROIDS.

10. Separable matroids. If $M = M_1 + M_2$, then $r(M) \le r(M_1) + r(M_2)$, on account of (3.3). If it is possible to divide the elements of $M$ into two groups, $M_1$ and $M_2$, each containing at least one element, such that

(10.1) $r(M) = r(M_1) + r(M_2)$,

or, which is equivalent (as $M_1$ and $M_2$ have no common elements),

(10.2) $n(M) = n(M_1) + n(M_2)$,

we shall say $M$ is separable; otherwise, $M$ is non-separable.⁴ Any single element forms a non-separable matroid. Any maximal non-separable part of $M$ is a component of $M$.⁵


THEOREM 11. If
$$M = M_1 + M_2, \quad r(M) = r(M_1) + r(M_2), \quad M'_1 \subset M_1, \quad M'_2 \subset M_2, \quad M' = M'_1 + M'_2,$$
then
$$r(M') = r(M'_1) + r(M'_2).$$

Set $M''_1 = M_1 - M'_1$, $M''_2 = M_2 - M'_2$. The relations (see Theorem 3)
$$r(M) = \Delta(M' + M''_1, M''_2) + \Delta(M', M''_1) + r(M') \le \Delta(M'_2, M''_2) + \Delta(M'_1, M''_1) + r(M') = r(M_2) - r(M'_2) + r(M_1) - r(M'_1) + r(M')$$
together with the fact that $r(M) = r(M_1) + r(M_2)$ show that $r(M') \ge r(M'_1) + r(M'_2)$ and hence $r(M') = r(M'_1) + r(M'_2)$.

⁴ Compare G, Theorem 15.
⁵ See G, § 4.


THEOREM 12.⁶ If $M = M_1 + M_2$, $r(M) = r(M_1) + r(M_2)$, $M'$ is non-separable, and $M' \subset M$, then either $M' \subset M_1$ or $M' \subset M_2$.

For suppose $M' = M'_1 + M'_2$, $M'_1 \subset M_1$, $M'_2 \subset M_2$, and $M'_1$ and $M'_2$ each contain an element. By the last theorem, $r(M') = r(M'_1) + r(M'_2)$, which cannot be.

THEOREM 13. If $M_1$ and $M_2$ are non-separable matroids with a common element $e$, then $M = M_1 + M_2$ is non-separable.

For suppose $M = M'_1 + M'_2$, $r(M) = r(M'_1) + r(M'_2)$. By the last theorem, $M_1 \subset M'_1$ or $M_1 \subset M'_2$, and $M_2 \subset M'_1$ or $M_2 \subset M'_2$; this shows that either $M'_1$ or $M'_2$ is void.


THEOREM 14. No two distinct components of $M$ have common elements.

This is a consequence of the last theorem. From this follows:

THEOREM 15.⁷ Any matroid may be expressed as a sum of components in a unique manner.

THEOREM 16.⁸ A non-separable matroid $M$ of nullity 1 is a circuit, and conversely.

If $M_1$ is a proper non-null subset of the non-separable matroid $M$, and $M_2 = M - M_1$, then $r(M) < r(M_1) + r(M_2)$. Hence
$$1 = n(M) > n(M_1) + n(M_2),$$
and $n(M_1) = 0$, proving that $M$ is a circuit.

Conversely, if $M = M_1 + M_2$ is a circuit, and $M_1$ and $M_2$ each contain elements, then
$$r(M_1) + r(M_2) = \rho(M_1) + \rho(M_2) - n(M_1) - n(M_2) = \rho(M) > r(M),$$
showing that $M$ is non-separable.

⁶ Compare G, Lemma, p. 344.
⁷ Compare G, Theorem 12.
⁸ Compare G, Theorem 10.


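LEMMA 9. Let $M = M_1 + M_2$ be non-separable, and let $M_1$ and $M_2$ each contain elements but have no common elements. Then there is a circuit $P$ in $M$ containing elements of both $M_1$ and $M_2$.

Suppose there were no such circuit. Say $M_2 = e_1 + \cdots + e_s$. Using Theorem 4, we see that
$$\Delta(M_1 + e_1 + \cdots + e_{i-1}, e_i) = \Delta(e_1 + \cdots + e_{i-1}, e_i) \qquad (i = 1, \dots, s),$$
and hence $r(M) = r(M_1) + r(M_2)$, a contradiction.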


THEOREM 17.⁹ Any non-separable matroid $M$ of nullity $n > 0$ can be built up in the following manner: Take a circuit $M_1$; add a set of elements which forms a circuit with one or more elements of $M_1$, forming a non-separable matroid $M_2$ of nullity 2 (if $n(M) > 1$); repeat this process till we have $M_n = M$.

As $n > 0$, $M$ contains a circuit $M_1$. If $n > 1$, we use the preceding lemma $n - 1$ times. The matroid at each step is non-separable, by Theorems 16 and 13.


THEOREM 18.¹⁰ Let $M = M_1 + \cdots + M_p$, and let $M_1, \dots, M_p$ be non-separable. Then the following statements are equivalent:

(1) $M_1, \dots, M_p$ are the components of $M$.

(2) No two of the matroids $M_1, \dots, M_p$ have common elements, and there is no circuit in $M$ containing elements of more than one of them.

(3) $r(M) = r(M_1) + \cdots + r(M_p)$.

We cannot replace rank by nullity in (3); see G, p. 347.
(2) follows from (1) on application of Theorems 13 and 16.

To prove (1) from (2), take any $M_i$. If it is not a component of $M$, there is a larger non-separable submatroid $M'_i$ of $M$ containing it. By Lemma 9, there is a circuit $P$ in $M'_i$ containing elements of $M_i$ and elements not in $M_i$; $P$ must contain elements of some other $M_j$, a contradiction.

Next we prove (3) from (1). If $p > 1$, $M$ is separable; say $M = M'_1 + M'_2$, $r(M) = r(M'_1) + r(M'_2)$. By Theorem 12, each $M_i$ is in either $M'_1$ or $M'_2$; hence $M'_1$ and $M'_2$ are each a sum of components of $M$. If one of these


⁹ See G, Theorem 19; also Whitney, “2-isomorphic graphs,” American Journal of Mathematics, vol. 55 (1933), p. 247, footnote.
¹⁰ Compare G, Theorem 17.


contains more than one component, we separate it similarly, etc. (3) now follows easily.

Finally we prove (1) from (3). Let $M'$ be a component of $M$, and suppose it has an element in $M_i$. As
$$r(M) = r(M_i) + r\Big(\sum_{j \ne i} M_j\Big),$$
$M'$ is contained in $M_i$, by Theorem 12; as $M_i$ is non-separable, $M' = M_i$.


THEOREM 19.¹¹ The elements $e_1$ and $e_2$ are in the same component of $M$ if and only if they are contained in a circuit $P$.

If $e_1$ and $e_2$ are both in $P$, they are part of a non-separable matroid, which lies in a single component of $M$. Suppose now $e_1$ and $e_2$ are in the same component $M_0$ of $M$, and suppose there is no circuit containing them both. Let $M_1$ be $e_1$ plus all elements which are contained in a circuit containing $e_1$. By Lemma 9, there is a subset $M^*$ of $M_0 - M_1$ which forms with part of $M_1$ a circuit $P_3$. $P_3$ does not contain $e_1$. If $e'_1$ is an element of $P_3$ in $M_1$, there is a circuit $P_1$ in $M_1$ containing $e_1$ and $e'_1$. Let $e_3$ be an element of $M^*$. Then in $M_1 + M^*$ there are circuits $P_1$ and $P_3$ which contain $e_1$ and $e_3$ respectively, and have a common element.

Let $M'$ be a smallest subset of $M_0$ which contains circuits $P'_1$ and $P'_3$ such that one contains $e_1$, the other contains $e_3$, and they have common elements. Then $P'_1$ and $P'_3$ are distinct, and $M' = P'_1 + P'_3$. Let $e_2$ be a common element. By Postulate $(C_2)$, there is a circuit $P_1$ in $M' - e_2$ containing $e_1$, and a circuit $P_3$ in $M' - e_2$ containing $e_3$. By the definition of $M'$, $P_1$ and $P_3$ have no common elements. By Postulate $(C_1)$, $P_1$ is not contained in $P'_1$; hence it contains an element $e_4$ of $M' - P'_1$. $P_1$ does not contain $e_3$. As $P_3$ is not contained in $P'_3$, it contains an element $e_5$ of $P'_1$. But now $P'_3$ contains $e_3$, $P_1$ contains $e_1$, $P'_3$ and $P_1$ have a common element $e_4$, and $P'_3 + P_1$ does not contain $e_5$ and is thus a proper subset of $M'$, a contradiction. This proves the theorem.
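Theorem 19, together with Theorems 13 and 14, suggests a simple way to compute components from a list of circuits: merge the elements of each circuit into one block. The sketch below is an illustration under stated assumptions; the function name and the example circuits (two triangles sharing an element, a pair of parallel elements, and one element lying in no circuit) are chosen here and are not from the paper.

```python
def components(ground, circuits):
    """Partition the ground set so that the elements of each circuit share a block.
    Elements lying in no circuit form blocks of their own."""
    parent = {e: e for e in ground}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for c in circuits:
        first, *rest = sorted(c)
        for e in rest:
            parent[find(e)] = find(first)

    blocks = {}
    for e in ground:
        blocks.setdefault(find(e), set()).add(e)
    return sorted(sorted(b) for b in blocks.values())

# Assumed example: circuits 123, 345, 1245 and 67 on the elements 1..8.
GROUND = set(range(1, 9))
CIRCUITS = [frozenset(s) for s in ({1, 2, 3}, {3, 4, 5}, {1, 2, 4, 5}, {6, 7})]
```

Here the elements 1, ..., 5 fall into one component, 6 and 7 into a second, and 8 forms a component by itself.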


11. Dual matroids. Suppose there is a 1-1 correspondence between the elements of the matroids $M$ and $M'$, such that if $N$ is any submatroid of $M$ and $N'$ is the complement of the corresponding matroid of $M'$, then

(11.1) $r(N') = r(M') - n(N)$.
¹¹ Compare D. König, Acta Litterarum ac Scientiarum Szeged, vol. 6, pp. 155-179, 4. (p. 159). The present theorem shows that a “glied” is the same as a component.


We say then that $M'$ is a dual of $M$.¹²

THEOREM 20. If $M'$ is a dual of $M$, then
$$r(M') = n(M), \qquad n(M') = r(M).$$

Set $N = M$; then $n(N) = n(M)$. In this case $N'$ is the null matroid, and $r(N') = 0$. (11.1) now gives $r(M') = n(M)$. Also
$$n(M') = \rho(M') - r(M') = \rho(M) - n(M) = r(M).$$

THEOREM 21. If $M'$ is a dual of $M$, then $M$ is a dual of $M'$.


Take any $N$ and corresponding $N'$ as before. The equations
$$r(N') = r(M') - n(N), \qquad r(M') = n(M), \qquad \rho(N) + \rho(N') = \rho(M)$$
give
$$r(N) = \rho(N) - n(N) = \rho(N) - [r(M') - r(N')] = \rho(N) - n(M) + [\rho(N') - n(N')] = \rho(M) - n(M) - n(N') = r(M) - n(N'),$$
as required.


THEOREM 22. Every matroid has a dual. 


This is in marked contrast to the case of graphs, for only a planar graph 
has a dual graph (see G, Theorem 29). 

Let $M'$ be a set of elements in 1-1 correspondence with elements of $M$. If $N'$ is any subset of $M'$, let $N$ be the complement of the corresponding subset of $M$, and set $r(N') = n(M) - n(N)$. $(R_1)$, $(R_2)$, $(R_3)$ are easily seen to hold in $M'$, as they hold in $M$; hence $M'$ is a matroid. Obviously $r(M') = n(M)$, and $M'$ is a dual of $M$.
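The construction in the proof of Theorem 22 translates directly into a rank function for the dual. The following is a minimal sketch with hypothetical names; the example (three elements, every non-null subset of rank 1) is chosen here for illustration, and the correspondence between $M$ and $M'$ is taken to be the identity on the element names.

```python
from itertools import combinations

def make_dual_rank(ground, r):
    """Return r' with r'(N') = n(M) - n(N), where N is the complement of N'
    and n(S) = (number of elements of S) - r(S)."""
    ground = frozenset(ground)

    def nullity(s):
        return len(s) - r(s)

    def dual_rank(x):
        return nullity(ground) - nullity(ground - frozenset(x))

    return dual_rank

def subsets(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for k in range(len(items) + 1)
            for c in combinations(items, k)]

# Assumed example: three elements, any non-null subset has rank 1.
GROUND = {1, 2, 3}

def r_example(s):
    return min(len(s), 1)

dual_rank = make_dual_rank(GROUND, r_example)
```

In this example the dual assigns rank min(k, 2) to a set of k elements, and applying the construction twice returns the original rank function, in agreement with Theorems 20 and 21.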


THEOREM 23. M and M’ are duals if and only if there is a 1—1 
correspondence between their elements such that bases in one correspond to 
base complements in the other. 


Suppose first M and M’ are duals. Let B be a base in either matroid, 
say in M, and let B’ be the complement of the corresponding submatroid of the 
other matroid, M’. Then 


¹² Compare G, § 8. Theorems 20, 21, 24, 25 correspond to Theorems 20, 21, 23, 25 in G. Note that two duals of the same matroid are isomorphic, that is, there is a 1-1 correspondence between their elements such that corresponding subsets have the same rank. Such a statement cannot be made about graphs. Compare H. Whitney, “2-isomorphic graphs,” American Journal of Mathematics, vol. 55 (1933), pp. 245-254.


r(B’) =r(M’) —n(B) =r(M’), 
n(B’) =r(M) —r(B) =0, 


and B’ is a base in M’, by Theorem 7. 

Suppose, conversely, that bases in one correspond to base complements in the other. Let $N$ be a submatroid of $M$ and let $N'$ be the complement of the corresponding submatroid of $M'$. There is a base $B'$ in $M'$ with $r(N')$ elements in $N'$, by Theorem 8. The complement in $M$ of the submatroid corresponding to $B'$ in $M'$ is a base $B$ in $M$ with $\rho(N') - r(N') = n(N')$ elements in $M - N$, and hence with $r(M) - n(N')$ elements in $N$. This shows that
$$r(N) = r(M) - n(N') + k, \qquad k \ge 0.$$
In a similar fashion we see that
$$r(N') = r(M') - n(N) + k', \qquad k' \ge 0.$$
As $B$ contains $r(M)$ elements and $B'$ contains $r(M')$ elements, $r(M) + r(M') = \rho(M)$. Hence, adding the above equations,
$$k + k' = r(N) + r(N') + n(N) + n(N') - r(M) - r(M') = \rho(N) + \rho(N') - \rho(M) = 0.$$
Hence $k = 0$, and the first equation above shows that $M$ and $M'$ are duals.


There are various other ways of stating conditions on certain submatroids of $M$ and $M'$ which will ensure these matroids being duals.¹³

THEOREM 24. Let $M_1, \dots, M_p$ and $M'_1, \dots, M'_p$ be the components of $M$ and $M'$ respectively, and let $M'_i$ be a dual of $M_i$ $(i = 1, \dots, p)$. Then $M'$ is a dual of $M$.

Let $N$ be any submatroid of $M$, and let the parts of $N$ in $M_1, \dots, M_p$ be $N_1, \dots, N_p$. Let $N'_i$ be the complement in $M'_i$ of the submatroid corresponding to $N_i$; then $N' = N'_1 + \cdots + N'_p$ is the complement in $M'$ of the submatroid corresponding to $N$. By Theorems 18 and 11 we have
$$r(N) = \sum r(N_i), \qquad r(N') = \sum r(N'_i),$$
so that also $n(N) = \sum n(N_i)$. Also
$$r(M') = \sum r(M'_i), \qquad r(N'_i) = r(M'_i) - n(N_i);$$
adding the last set of equations gives $r(N') = r(M') - n(N)$, as required.


¹³ See for instance a paper by the author, “Planar graphs,” Fundamenta Mathematicae, vol. 21 (1933), pp. 73-84, Theorem 2. Cut sets may of course be defined in terms of rank.


THEOREM 25. Let $M$ and $M'$ be duals, and let $M_1, \dots, M_p$ be the components of $M$. Let $M'_1, \dots, M'_p$ be the corresponding submatroids of $M'$. Then $M'_1, \dots, M'_p$ are the components of $M'$, and $M'_i$ is a dual of $M_i$ $(i = 1, \dots, p)$.

The complement in $M$ of the submatroid corresponding to $M'_i$ in $M'$ is $\sum_{j \ne i} M_j$. Hence, as $M$ and $M'$ are duals and the $M_j$ $(j \ne i)$ are the components of $\sum_{j \ne i} M_j$ (see Theorem 18),
$$r(M'_i) = r(M') - n\Big(\sum_{j \ne i} M_j\Big) = r(M') - \sum_{j \ne i} n(M_j).$$
Adding gives
$$\sum_i r(M'_i) = p\,r(M') - (p - 1)\,n(M) = p\,r(M') - (p - 1)\,r(M') = r(M').$$
Therefore, by Theorem 12, each component of $M'$ is contained in some $M'_i$. In the same way we see that each component of $M$ is contained in a matroid corresponding to a component of $M'$; hence the components of one matroid correspond exactly to the components of the other.

Let $N_i$ be any submatroid of $M_i$, and let $N'$ and $N'_i$ be the complements in $M'$ and $M'_i$ of the submatroid corresponding to $N_i$. The equations
$$r(M') = r(M'_i) + \sum_{j \ne i} r(M'_j), \qquad r(N') = r(N'_i) + \sum_{j \ne i} r(M'_j), \qquad r(N') = r(M') - n(N_i)$$
give
$$r(N'_i) = r(M'_i) - n(N_i),$$
which shows that $M'_i$ is a dual of $M_i$.


THEOREM 26. A dual of a non-separable matroid is non-separable. 


This is a consequence of the last theorem. 


III. MATRICES AND MATROIDS.

12. Matrices, matroids, and hyperplanes. Consider the matrix
$$\mathbf{M} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{Vmatrix};$$
let its columns be $C_1, \dots, C_n$. Any subset $N$ of these columns forms a matrix, and this matrix has a rank, $r(N)$. If we consider the columns as abstract elements, we have a matroid $M$. The proof of this is simple if we consider the rank of a matrix as the number of linearly independent columns in it. $(R_1)$ and $(R_2)$ are then obvious. To prove $(R_3)$, suppose $r(N + C_1) = r(N + C_2) = r(N)$; then $C_1$ and $C_2$ can each be expressed as a linear combination of the columns of $N$, and hence $r(N + C_1 + C_2) = r(N)$. The terms independent and base carry over to matrices and agree with the ordinary definitions; a base in $M$ is a minimal set of columns in terms of which all remaining columns of $M$ may be expressed.
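The rank function of the matroid of a matrix can be computed exactly for small cases. The sketch below is an illustration only: the function name and the 2 x 4 example matrix are chosen here, and exact rational arithmetic stands in for the real coefficients of the text.

```python
from fractions import Fraction

def column_rank(matrix, cols):
    """Rank of the submatrix formed by the chosen columns, by exact
    Gaussian elimination over the rationals."""
    rows = [[Fraction(row[c]) for c in cols] for row in matrix]
    rank = 0
    for col in range(len(cols)):
        # Look for a row at or below position `rank` with a non-zero entry here.
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(rank + 1, len(rows)):
            factor = rows[i][col] / rows[rank][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

# Assumed example: columns C0 = (1,0), C1 = (0,1), C2 = (1,1), C3 = (2,0).
A = [[1, 0, 1, 2],
     [0, 1, 1, 0]]
```

With $N$ = columns 0 and 1, adding column 2 or column 3 leaves the rank at 2, and so does adding both, as $(R_3)$ requires; columns 0 and 3 together have rank 1.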

We may interpret M geometrically in two different ways; the second is 
the more interesting for our purposes: 

(a) Let $E_m$ be Euclidean space of $m$ dimensions. Corresponding to each column $C_i$ of $M$ there is a point $X_i$ in $E_m$ with coördinates $a_{1i}, \dots, a_{mi}$. The subset $C_{i_1}, \dots, C_{i_p}$ of $M$ is linearly independent if and only if the points $O = (0, \dots, 0), X_{i_1}, \dots, X_{i_p}$ are linearly independent in $E_m$, i.e. if and only if these $p + 1$ points determine a hyperplane in $E_m$ of dimension $p$. A base in $M$ corresponds to a minimal set of points $X_{i_1}, \dots, X_{i_p}$ in $E_m$ such that each $X_i$ of $M$ lies in the hyperplane determined by $O, X_{i_1}, \dots, X_{i_p}$. Then $p$ is the rank of $M$.

(b) Let $E_n$ be Euclidean space of $n$ dimensions. Let $R_1, \dots, R_m$ be the rows of $M$. If $Y_1, \dots, Y_m$ are the corresponding points of $E_n$: $Y_i = (a_{i1}, \dots, a_{in})$, then the points $O, Y_1, \dots, Y_m$ determine a hyperplane $H = H(M)$, which we shall call the hyperplane associated with $M$. The dimension $d(H)$ of $H$ is $r(M)$. Let $N = C_{i_1} + \cdots + C_{i_p}$ be a subset of $M$, and let $E'$ be the $p$-dimensional coördinate subspace of $E_n$ containing the $x_{i_1}$ and ... and the $x_{i_p}$ axes. The $j$-th row of $N$ corresponds to the point $Y'_j$ in $E'$ with coördinates $(a_{ji_1}, \dots, a_{ji_p})$; this is just the projection of $Y_j$ onto $E'$. If $H'$ is the hyperplane in $E'$ determined by the points $O, Y'_1, \dots, Y'_m$, then $H'$ is exactly the projection of $H$ onto $E'$, and

(12.1) $d(H') = r(N)$.

Let $N = (C_{i_1}, \dots, C_{i_p})$ be any subset of $M$, and let $E'$, $H'$ correspond to $N$. Then $N$ is independent if and only if
$$d(H') = p,$$
and is a base if and only if
$$d(H') = d(H) = p.$$


THEOREM 27. There is a unique matroid $M$ associated with any hyperplane $H$ through the origin in $E_n$.

Let $M$ contain the elements $e_1, \dots, e_n$, one corresponding to each coördinate of $E_n$. Given any subset $e_{i_1}, \dots, e_{i_p}$, we let its rank be the dimension of the projection of $H$ onto the corresponding coördinate hyperplane $E'$ of $E_n$. It was seen above that if $\mathbf{M}$ is any matrix determining $H$, then $M$ is the matroid associated with $\mathbf{M}$.


13. Orthogonal hyperplanes and dual matroids. We prove the fol- 
lowing theorem : 


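THEOREM 28. Let $H$ be a hyperplane through the origin in $E_n$, of dimension $r$, and let $H'$ be the orthogonal hyperplane through the origin, of dimension $n - r$. Let $M$ and $M'$ be the associated matroids. Then $M$ and $M'$ are duals.

We shall show that bases in one matroid correspond to base complements in the other; Theorem 23 then applies. Let
$$\mathbf{M} = \begin{Vmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{r1} & \cdots & a_{rn} \end{Vmatrix}, \qquad \mathbf{M}' = \begin{Vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{n-r,1} & \cdots & b_{n-r,n} \end{Vmatrix}$$
be matrices determining $H$ and $H'$ respectively. Say the first $r$ columns of $\mathbf{M}$ form a base in $M$, i.e. the corresponding determinant $A$ is $\ne 0$. As $H$ and $H'$ are orthogonal, we have for each $i$ and $j$
$$a_{i1} b_{j1} + \cdots + a_{in} b_{jn} = 0.$$
Keeping $j$ fixed, we have a set of $r$ linear equations in the $b_{jk}$. Transpose the last $n - r$ terms in each equation to the other side, and solve for $b_{jk}$ $(k = 1, \dots, r)$. We find
$$b_{jk} = \sum_{l=r+1}^{n} c_{kl}\, b_{jl},$$
where, by Cramer's rule, $c_{kl}$ is $-1/A$ times the determinant obtained from $A$ on replacing its $k$-th column by the $l$-th column of $\mathbf{M}$. This is true for each $j = 1, \dots, n - r$, and the $c_{kl}$ are independent of $j$. Thus the $k$-th column of $\mathbf{M}'$ is expressed in terms of the last $n - r$ columns. As this is true for $k = 1, \dots, r$, the last $n - r$ columns form a base in $M'$, as required.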
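Theorem 28 can be checked numerically on a small matrix by computing rows that span the orthogonal hyperplane and testing relation (11.1) on every subset of columns. This is a sketch under stated assumptions: the function names and the 2 x 4 example are chosen here, and exact rational arithmetic replaces real coefficients.

```python
from fractions import Fraction
from itertools import combinations

def rref(matrix):
    """Reduced row echelon form; returns the non-zero rows and the pivot columns."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    return rows[:r], pivots

def rank_of_columns(matrix, cols):
    return len(rref([[row[c] for c in cols] for row in matrix])[1])

def orthogonal_complement(matrix):
    """Rows spanning the hyperplane orthogonal to the rows of `matrix`."""
    reduced, pivots = rref(matrix)
    n = len(matrix[0])
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, p in zip(reduced, pivots):
            v[p] = -row[f]
        basis.append(v)
    return basis

def is_dual_pair(m, m_prime):
    """Test (11.1): r(N') = r(M') - n(N) for every column subset N with complement N'."""
    full = list(range(len(m[0])))
    total = rank_of_columns(m_prime, full)
    for k in range(len(full) + 1):
        for sub in combinations(full, k):
            comp = [c for c in full if c not in sub]
            nullity = k - rank_of_columns(m, list(sub))
            if rank_of_columns(m_prime, comp) != total - nullity:
                return False
    return True

M = [[1, 0, 1, 1],
     [0, 1, 1, 2]]          # assumed example matrix
M_PRIME = orthogonal_complement(M)
```

For this example `M_PRIME` has the two rows (-1, -1, 1, 0) and (-1, -2, 0, 1), each orthogonal to both rows of `M`.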


14. The circuit matrix of a given matrix. Consider the matrix $\mathbf{M}$ of § 12. Suppose the columns $C_{i_1}, \dots, C_{i_p}$ form a circuit, i.e. the corresponding elements of the corresponding matroid form a circuit. Then these columns are linearly dependent, and there are numbers $b_1, \dots, b_n$, not all zero, such that

(14.1) $\sum_{j=1}^{n} a_{ij} b_j = 0$ $(i = 1, \dots, m)$, $\quad b_j = 0$ $(j \ne i_1, \dots, i_p)$.

The $b_j$ are all $\ne 0$ $(j = i_1, \dots, i_p)$, for otherwise a proper subset of the columns would be dependent, contrary to the definition of a circuit. (They are uniquely determined except for a constant factor; see Lemma 11.) Suppose the circuits of $M$ are $P_1, \dots, P_s$. Then there are corresponding sets of numbers $b_{i1}, \dots, b_{in}$ $(i = 1, \dots, s)$, forming a matrix
$$\mathbf{M}' = \begin{Vmatrix} b_{11} & \cdots & b_{1n} \\ \vdots & & \vdots \\ b_{s1} & \cdots & b_{sn} \end{Vmatrix},$$
the circuit matrix of the matrix $\mathbf{M}$.

THEOREM 29. Let $P_1, \dots, P_q$ be a fundamental set of circuits in $M$ (see § 9). Then the corresponding rows of the circuit matrix $\mathbf{M}'$ form a base for the rows of $\mathbf{M}'$. Hence $r(\mathbf{M}') = q = n(M)$.

Suppose the columns of $\mathbf{M}$ are ordered so that $P_i$ contains $C_{n-q+i}$ but no column $C_{n-q+j}$ $(j > i)$. Then if the corresponding row of $\mathbf{M}'$ is $R'_i = (b_{i1}, \dots, b_{in})$, we have $b_{i,n-q+i} \ne 0$ and $b_{i,n-q+j} = 0$ $(j > i)$. Hence the rows $R'_1, \dots, R'_q$ of $\mathbf{M}'$ are linearly independent, and $r(\mathbf{M}') \ge q$. As each row of $\mathbf{M}'$ is orthogonal to every row of $\mathbf{M}$ by (14.1), $r(\mathbf{M}') \le n - r(M) = n(M)$. Hence $r(\mathbf{M}') = n(M) = q$, and each row of $\mathbf{M}'$ may be expressed in terms of $R'_1, \dots, R'_q$.


THEOREM 30. If $\mathbf{M}'$ is the circuit matrix of $\mathbf{M}$ and $H'$, $H$ are the corresponding hyperplanes, then $H'$ is the hyperplane of maximum dimension orthogonal to $H$.

This is a consequence of (14.1) and the last theorem.

THEOREM 31. The matroids corresponding to a matrix and its circuit matrix are duals.


This follows from the last theorem and Theorem 28. 


15. On the structure of a circuit matrix. Let $M$ be any matroid, and $M'$, its dual. If there exists a matrix $\mathbf{M}$ corresponding to $M$, it is perhaps most easily constructed by considering it as the circuit matrix of a matrix $\mathbf{M}'$ corresponding to $M'$. Let $H$ and $H'$ be the hyperplanes corresponding to $\mathbf{M}$ and $\mathbf{M}'$. We shall say the set of numbers $(a_1, \dots, a_n)$ is in $Z_{i_1 \cdots i_p}$ if
$$a_j \ne 0 \ (j = i_1, \dots, i_p), \qquad a_j = 0 \ (j \ne i_1, \dots, i_p).$$
If $(a_1, \dots, a_n)$ is in $H$ and in $Z_{i_1 \cdots i_p}$, then the columns $C_{i_1}, \dots, C_{i_p}$ of $\mathbf{M}'$ are dependent, evidently.

LEMMA 10. Let $(b_1, \dots, b_n)$ be a point of $H$. If it is in $Z_{i_1 \cdots i_p}$, then the matroid $N' = e_{i_1} + \cdots + e_{i_p}$ is the union of a set of circuits in $M'$.

Here $e_i$ in $M'$ corresponds to the column $C_i$. We need merely show that for each $s$, there is a circuit $P$ in $N'$ containing $e_{i_s}$. Let $k_1 = i_s, k_2, \dots, k_q$ be a minimal set of numbers from $(i_1, \dots, i_p)$ containing $i_s$, such that there is a point $(c_1, \dots, c_n)$ of $H$ in $Z_{k_1 \cdots k_q}$; then $e_{k_1} + \cdots + e_{k_q}$ is the required circuit. For if it were not a circuit, there would be a proper subset $(l_1, \dots, l_t)$ of $(k_1, \dots, k_q)$ and a point $(d_1, \dots, d_n)$ of $H$ in $Z_{l_1 \cdots l_t}$. No $l_j = k_1$, on account of the minimal property of $(k_1, \dots, k_q)$. Say $l_1 = k_u$, and set
$$a_j = d_{l_1} c_j - c_{l_1} d_j \qquad (j = 1, \dots, n).$$
Then $(a_1, \dots, a_n)$ is in $H$ and in $Z_{m_1 \cdots m_v}$ with $(m_1, \dots, m_v)$ a proper subset of $(k_1, \dots, k_q)$ containing $k_1$, again a contradiction.

LEMMA 11. If $P = e_{i_1} + \cdots + e_{i_p}$ is a circuit of $M'$ and $(b_1, \dots, b_n)$ and $(b'_1, \dots, b'_n)$ are in $H$ and in $Z_{i_1 \cdots i_p}$, then these two sets are proportional.

For otherwise, $(c_1, \dots, c_n)$ with $c_j = b'_{i_1} b_j - b_{i_1} b'_j$ would be a point of $H$ in some $Z_{k_1 \cdots k_q}$ with $(k_1, \dots, k_q)$ a proper subset of $(i_1, \dots, i_p)$, and $P$ would not be a circuit.

It is instructive to show directly that Postulate $(C_2)$ holds for matrices: $P_1$ and $P_2$ are represented by rows $(b_1, \dots, b_n)$ and $(b'_1, \dots, b'_n)$ of $\mathbf{M}$, lying in $Z_{1\,2\,i_3 \cdots i_p}$ and $Z_{1\,k_2 \cdots k_q}$ respectively, where $k_2, \dots, k_q \ne 2$. Set $c_j = b'_1 b_j - b_1 b'_j$; then $(c_1, \dots, c_n)$ is in $H$ and in $Z_{2\,l_2 \cdots l_t}$, with $(l_2, \dots, l_t)$ a subset of $(i_3, \dots, i_p, k_2, \dots, k_q)$; the existence of $P_3$ now follows from Lemma 10.


THEOREM 32. Let $\mathbf{M}$ be the circuit matrix of $\mathbf{M}'$. Let $P_1, \dots, P_q$ form a strict fundamental set of circuits in $M'$ with respect to $e_{n-q+1}, \dots, e_n$, and let the first $q$ rows in $\mathbf{M}$ correspond to $P_1, \dots, P_q$. Let $(i_1, \dots, i_s)$ be any set of numbers from $(1, \dots, q)$, let $(j_1, \dots, j_s)$ be any set from $(1, \dots, n - q)$, and let $(i'_1, \dots, i'_{q-s})$ be the set complementary to $(i_1, \dots, i_s)$ in $(1, \dots, q)$. Then the determinant $D$ in $\mathbf{M}$ with rows $i_1, \dots, i_s$ and columns $j_1, \dots, j_s$ equals zero if and only if the determinant $D'$ with rows $1, \dots, q$ and columns $j_1, \dots, j_s, n-q+i'_1, \dots, n-q+i'_{q-s}$ equals zero, or, if and only if there exists a circuit $P$ in $M'$ containing none of the columns $e_{j_1}, \dots, e_{j_s}, e_{n-q+i'_1}, \dots, e_{n-q+i'_{q-s}}$.

In the matrix of the last $q = r(\mathbf{M})$ columns of $\mathbf{M}$, the terms along the main diagonal and only those are $\ne 0$. If we expand $D'$ by Laplace's expansion in terms of the columns $n-q+i'_1, \dots, n-q+i'_{q-s}$, we see at once that $D' = 0$ if and only if $D = 0$.

Suppose $D = 0$. Then there is a set of numbers $(\alpha_1, \dots, \alpha_q)$, not all zero, with $\alpha_i = 0$ $(i = i'_1, \dots, i'_{q-s})$, such that
$$b_k = \alpha_1 b_{1k} + \cdots + \alpha_q b_{qk} = 0 \qquad (k = j_1, \dots, j_s),$$
$(b_{i1}, \dots, b_{in})$ being the $i$-th row of $\mathbf{M}$. $b_k = 0$ also for $k = n-q+i'_1, \dots, n-q+i'_{q-s}$, as each term is zero for such $k$. The point $(b_1, \dots, b_n)$ is in $H$. Any circuit given by Lemma 10 is the required circuit $P$.

Suppose the circuit $P$ exists. Then it is represented by a row $(b_1, \dots, b_n)$ in $\mathbf{M}$. As the first $q$ rows of $\mathbf{M}$ are of rank $q = r(\mathbf{M})$, $(b_1, \dots, b_n)$ can be expressed in terms of them; say $b_k = \sum_i \alpha_i b_{ik}$. As $b_k = 0$ $(k = n-q+i'_1, \dots, n-q+i'_{q-s})$, certainly $\alpha_i = 0$ $(i = i'_1, \dots, i'_{q-s})$. $D = 0$ now follows from the fact that $b_k = 0$ $(k = j_1, \dots, j_s)$.


16. A matroid with no corresponding matrix.¹⁴ The matroid $M'$ has seven elements, which we name $1, \dots, 7$. The bases consist of all sets of three elements except

(16.1) 124, 135, 167, 236, 257, 347, 456.

Defining rank in terms of bases, we have: Each set of $k$ elements is of rank $k$ if $k \le 2$ and of rank 3 if $k \ge 4$; a set of three elements is of rank 2 if the set is in (16.1) and is of rank 3 otherwise. It is easy to see that the postulates for rank are satisfied. $(R_3)$ in the case that $N$ contains two elements is satisfied vacuously. For suppose $r(N + e_1) = r(N + e_2) = r(N) = 2$. Then $N + e_1$ and $N + e_2$ are both in (16.1); but any two of these sets have but a single element in common.


¹⁴ After the author had noted that $M'$ satisfies (C*) and corresponds to no linear graph, and had discovered a matroid with nine elements corresponding to no matrix, Saunders MacLane found that $M'$ corresponds to no matrix, and is a well known example of a finite projective geometry (see O. Veblen and J. W. Young, Projective Geometry, pp. 3-5).


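If there exists a matrix $\mathbf{M}'$ corresponding to $M'$, then let $\mathbf{M}$ be its circuit matrix. 123 is a base in $M'$, and hence

(16.2) 124, 135, 236, 1237

form a fundamental set of circuits in $M'$. Let $R_1, R_2, R_3, R_4$ be the corresponding rows of $\mathbf{M}$. By multiplying in succession row 1, column 2, rows 2, 3, 4, and columns 4, 5, 6, 7 by suitable constants $\ne 0$, we bring $\mathbf{M}$ into the following form:
$$(16.3) \qquad \begin{Vmatrix} 1 & 1 & 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & a & 0 & 1 & 0 & 0 \\ 0 & 1 & b & 0 & 0 & 1 & 0 \\ 1 & c & d & 0 & 0 & 0 & 1 \end{Vmatrix}$$
$a$, $b$, $c$ and $d$ are $\ne 0$. We now apply Theorem 32 with the sets
$$(1, 4;\ 1, 2), \qquad (2, 4;\ 1, 3), \qquad (3, 4;\ 2, 3),$$
i.e. using the circuits 347, 257, 167. This gives
$$\begin{vmatrix} 1 & 1 \\ 1 & c \end{vmatrix} = \begin{vmatrix} 1 & a \\ 1 & d \end{vmatrix} = \begin{vmatrix} 1 & b \\ c & d \end{vmatrix} = 0,$$
and hence $c = 1$, $a = d = b$. Using the circuit 456, with sets $(1, 2, 3;\ 1, 2, 3)$, gives $2a = 0$, $a = 0$, a contradiction.

In regard to this example, see the end of the paper.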
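The role of the equation $2a = 0$ can be illustrated numerically. The sketch below is not from the paper: it assumes one particular assignment of 0-1 columns to the seven elements (a base 1, 2, 3 and the sums suggested by the circuits (16.2)), and compares which triples are dependent when determinants are taken mod 2 and when they are taken as ordinary integers. It illustrates a single coordinate choice and is not a substitute for the proof above.

```python
from itertools import combinations

# Assumed coordinates: 4 = 1+2, 5 = 1+3, 6 = 2+3, 7 = 1+2+3.
POINTS = {1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1), 4: (1, 1, 0),
          5: (1, 0, 1), 6: (0, 1, 1), 7: (1, 1, 1)}

# The seven excluded triples of (16.1).
LINES = {frozenset(t) for t in [(1, 2, 4), (1, 3, 5), (1, 6, 7), (2, 3, 6),
                                (2, 5, 7), (3, 4, 7), (4, 5, 6)]}

def det3(u, v, w):
    """Determinant of the 3 x 3 matrix with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def dependent_triples(modulus=None):
    """Triples of points with zero determinant, optionally reduced mod `modulus`."""
    out = set()
    for t in combinations(sorted(POINTS), 3):
        d = det3(*(POINTS[i] for i in t))
        if modulus is not None:
            d %= modulus
        if d == 0:
            out.add(frozenset(t))
    return out
```

Mod 2 the dependent triples are exactly the seven sets of (16.1). With ordinary integer determinants the triple 456 has determinant -2 and is therefore independent, which is the same obstruction as $2a = 0$ forcing $a = 0$.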


APPENDIX. 


MATRICES OF INTEGERS MOD 2. 


We wish to characterize those matroids $M$ corresponding to matrices $\mathbf{M}$ of integers mod 2,¹⁵ i.e. matrices whose elements are all 0 or 1, where rank etc. is defined mod 2. We shall consider linear combinations, or chains:

(A.1) $\alpha_1 e_1 + \cdots + \alpha_n e_n$ ($\alpha$'s integers mod 2)

in the elements of $M$. The $\alpha$'s may be taken as 0 or 1; (A.1) may then be interpreted as the submatroid $N$ whose elements have the coefficient 1. Conversely, any $N \subset M$ may be written as a chain. Submatroids are added


¹⁵ See O. Veblen, “Analysis situs,” 2nd ed., American Mathematical Society Colloquium Publications, Ch. I and Appendix 2.


(mod 2) by adding the corresponding chains (mod 2). For instance,
$$(e_1 + e_2) + (e_2 + e_3) \equiv e_1 + e_3 \pmod 2.$$

Any sum (mod 2) of circuits in $M$ we shall call a cycle in $M$. $N$ is the true sum of $N_1, \dots, N_s$ if these latter have no common elements and $N = N_1 + \cdots + N_s$. We consider matroids which satisfy the following postulate:

(C*) Each cycle is a true sum of circuits.

Postulate $(C_2)$ is a consequence of (C*). For the cycle $P_1 + P_2$ (mod 2) is a submatroid containing $e_2$ but not $e_1$; the existence of $P_3$ now follows from (C*).

A simple example of a matroid not satisfying (C*) is given by the 
matroid M’ at the end of § 9. 


THEOREM 33. A circuit is a minimal non-null cycle, and conversely.
This is proved with the aid of Postulates ($C_1$) and (C*).


THEOREM 34. Let $P_1, \dots, P_q$ be a strict fundamental set of circuits
in M with respect to $e_{n-q+1}, \dots, e_n$. Then there are exactly $2^q$ cycles
in M, formed by taking all sums (mod 2) of $P_1, \dots, P_q$.


First, each sum $P_{i_1} + \cdots + P_{i_s}$ (mod 2) is a cycle, containing
$e_{n-q+i_1}, \dots, e_{n-q+i_s}$ and elements (perhaps) from
$B = e_1 + \cdots + e_{n-q}$; obviously distinct sums give distinct cycles.
Now let Q be any cycle in M; say Q contains $e_{n-q+k_1}, \dots, e_{n-q+k_t}$
and elements (perhaps) from B. Set $Q' = P_{k_1} + \cdots + P_{k_t}$; then
$Q + Q'$ is a cycle containing elements from B alone. But B is a base (see the
proof of Theorem 10), and hence contains no circuits. Consequently $Q + Q'$
is the null cycle, and $Q = Q'$.
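The counting argument of Theorem 34 can be tried out on a small case. The following is a minimal sketch, not part of the original text: subsets are represented as Python frozensets, the sum (mod 2) is taken to be the symmetric difference, and the function name `all_cycles` is an illustrative choice. The sample data are the circuits (16.2) of § 16.

```python
from itertools import combinations

def all_cycles(fundamental):
    """Return every sum (mod 2) of the given circuits, each as a frozenset.

    The sum (mod 2) of two subsets is their symmetric difference.
    """
    cycles = set()
    for r in range(len(fundamental) + 1):
        for chosen in combinations(fundamental, r):
            total = frozenset()          # the null cycle
            for circuit in chosen:
                total = total ^ circuit  # add (mod 2)
            cycles.add(total)
    return cycles

# Circuits (16.2), fundamental with respect to the elements 4, 5, 6, 7.
P = [frozenset(s) for s in ({1, 2, 4}, {1, 3, 5}, {2, 3, 6}, {1, 2, 3, 7})]
cycles = all_cycles(P)
print(len(cycles))  # 16 = 2**4 distinct cycles, the null cycle included
```

Distinct selections give distinct sums because each $P_i$ carries its own element outside the base, which is exactly the observation used in the proof.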


THEOREM 35. As soon as the circuits of a strict fundamental set are 
known, all the circuits may be determined. 


This is a consequence of the last two theorems. It is to be contrasted 
with the final remark of § 9. 


Remark. The word “strict” may be omitted in the last two theorems. 


THEOREM 36. Let $e_1, \dots, e_n$ be a set of elements, and let $P_1, \dots, P_q$
be any subsets such that $P_i$ contains $e_{n-q+i}$ and possibly elements from
$e_1, \dots, e_{n-q}$ alone. Then there is a unique matroid M satisfying (C*), with
$P_1, \dots, P_q$ as a strict fundamental set of circuits.




We form the $2^q$ cycles of Theorem 34. Those cycles which contain no
other non-null cycle as a proper subset we call circuits; in particular,
$P_1, \dots, P_q$ are circuits. To prove (C*), let Q be a non-null cycle. If it
is not a circuit, it contains a circuit P as a proper subset. Q and P are
sums (mod 2) from $P_1, \dots, P_q$, hence the same is true of $Q - P$, and
$Q - P$ is one of the $2^q$ cycles. If it is not a circuit, we again extract a
circuit, etc.

This theorem furnishes a simple method of constructing all matroids 
satisfying (C*). 
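The construction in the proof of Theorem 36 can be sketched as follows. This is an illustration under stated assumptions rather than the author's own procedure: sets are frozensets, sums (mod 2) are symmetric differences, and the helper names `cycles_from` and `circuits_from` are hypothetical.

```python
def cycles_from(fundamental):
    """All sums (mod 2) of the fundamental circuits, built incrementally."""
    cycles = {frozenset()}
    for p in fundamental:
        # Adding p to every cycle found so far doubles the collection.
        cycles |= {c ^ p for c in cycles}
    return cycles

def circuits_from(fundamental):
    """Circuits = non-null cycles containing no other non-null cycle properly."""
    nonnull = [c for c in cycles_from(fundamental) if c]
    return {c for c in nonnull if not any(d < c for d in nonnull)}

# The fundamental set (16.2) with respect to the base 123.
P = [frozenset(s) for s in ({1, 2, 4}, {1, 3, 5}, {2, 3, 6}, {1, 2, 3, 7})]
circuits = circuits_from(P)
print(len(circuits))                       # 14
print(sorted(sorted(c) for c in circuits if len(c) == 3))
```

For this input the only non-null cycle that is not a circuit is the full set 1234567, which is the true sum of, for instance, 124 and 3567, in agreement with (C*).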

We turn now to the study of matrices of integers (mod 2)


$M = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$ (each $a_{ij} = 0$ or 1).


Any linear combination (mod 2) of the columns
(A. 2) $\alpha_1 C_1 + \cdots + \alpha_n C_n$ ($\alpha$'s integers mod 2)


is a set of numbers $(\sum_j \alpha_j a_{1j}, \dots, \sum_j \alpha_j a_{mj})$, which
we call a chain (mod 2) in M. As before, we may take each coefficient as 0 or 1,
and we may consider any chain merely as a submatrix of M. The chain is a cycle
if each of the corresponding numbers is $\equiv 0$ (mod 2). The columns
$C_{i_1}, \dots, C_{i_s}$ are independent (mod 2) if there exists no set of integers
$\alpha_1, \dots, \alpha_n$ not all $\equiv 0$ (mod 2), with $\alpha_i \equiv 0$
$(i \neq i_1, \dots, i_s)$, such that $\sum \alpha_i C_i$ is a cycle, i.e. if no
non-null subset of $C_{i_1}, \dots, C_{i_s}$ is a cycle. Using this definition, the
terms base, circuit, rank, nullity etc. (mod 2) can be defined as in Part I.
Let M be a set of elements $e_1, \dots, e_n$ corresponding to $C_1, \dots, C_n$ in
the matrix M, and let $e_{i_1} + \cdots + e_{i_s}$ be a circuit in the matroid M if
and only if $C_{i_1}, \dots, C_{i_s}$ is a circuit in the matrix M. We shall show
that the matroid M satisfies (C*) and that the definitions of cycle in the
matroid M and in the matrix M agree.

We show first that each circuit is a cycle in M. If $C_{i_1}, \dots, C_{i_s}$ is a
circuit, then these columns are dependent; hence $\sum \alpha_i C_i$ is a cycle,
with $\alpha_i \equiv 0$ $(i \neq i_1, \dots, i_s)$. Moreover $\alpha_i \equiv 1$
$(i = i_1, \dots, i_s)$, for otherwise a proper subset of $C_{i_1}, \dots, C_{i_s}$
would be dependent. Hence $C_{i_1} + \cdots + C_{i_s}$ is a cycle. Next, any sum
(mod 2) of circuits is a cycle, evidently. Next we prove (C*). Suppose
$Q = C_{i_1} + \cdots + C_{i_s}$ is a cycle. Let $(k_1, \dots, k_q)$ be a minimal
subset of $(i_1, \dots, i_s)$ such that $P = C_{k_1} + \cdots + C_{k_q}$ is a cycle;
then P is a circuit. $Q - P$ is a cycle; from it we extract a circuit, just as
above, etc. It follows from (C*) that the definitions of cycle in the matroid M
and in the matrix M agree. Theorems 33, 34 and 35 now apply to M also.

We are now ready to prove the final theorem: 




THEOREM 37. Let M be any matroid satisfying (C*). Suppose
$\rho(M) = n$, and $e_1 + \cdots + e_{n-q}$ is a base. Then if $M_1$ is any matrix of
integers (mod 2) with $n - q$ columns which are independent (mod 2),
columns $C_{n-q+1}, \dots, C_n$ can be adjoined in a unique manner to $M_1$, forming
a matrix M of which the corresponding matroid is M.


Let $P_1, \dots, P_q$ be a strict fundamental set of circuits in M with
respect to $e_{n-q+1}, \dots, e_n$ (Theorem 9). Say
$P_1 = e_{i_1} + \cdots + e_{i_s} + e_{n-q+1}$. Set
$C_{n-q+1} \equiv C_{i_1} + \cdots + C_{i_s}$ (mod 2); this determines $C_{n-q+1}$ as
a column of 0's and 1's so that $P'_1 = C_{i_1} + \cdots + C_{i_s} + C_{n-q+1}$ is a
circuit. (It is a cycle; (C*) shows that it is a single circuit, as
$C_1 + \cdots + C_{n-q}$ contains no circuit.) $C_{n-q+1}$ evidently must be chosen
in this manner. We choose the remaining columns of M similarly. Let M’ be the
matroid corresponding to M. Then $P'_1, \dots, P'_q$ is a strict set of circuits
in M’. These same sets form a strict set in M; hence, by Theorem 35, the circuits
in M’ correspond to those in M. Consequently M’ = M, completing the proof.
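The adjoining of columns described in this proof may be illustrated by a short sketch. It assumes that the base elements are numbered $1, \dots, n-q$, that columns are tuples of 0's and 1's, and that $M_1$ is the identity matrix of order 3; the function name `adjoin_columns` is hypothetical.

```python
def adjoin_columns(base_columns, fundamental):
    """Adjoin one column per fundamental circuit.

    base_columns[j - 1] is the column of the base element j.  Each circuit
    holds base elements plus exactly one element outside the base; the new
    column is the sum (mod 2) of the base columns occurring in the circuit.
    """
    n_base = len(base_columns)
    rows = len(base_columns[0])
    columns = list(base_columns)
    for circuit in fundamental:
        new = [0] * rows
        for e in sorted(circuit):
            if e <= n_base:  # only base elements contribute to the sum
                new = [(a + b) % 2 for a, b in zip(new, base_columns[e - 1])]
        columns.append(tuple(new))
    return columns

M1 = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]                 # independent (mod 2)
P = [{1, 2, 4}, {1, 3, 5}, {2, 3, 6}, {1, 2, 3, 7}]    # circuits (16.2)
M = adjoin_columns(M1, P)
for number, column in enumerate(M, start=1):
    print(number, column)
```

With this choice of $M_1$ the seven columns are the seven non-zero columns of three integers (mod 2), and the columns 4, 5, 6 sum to zero (mod 2), in agreement with the circuit 456.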

We end by noting that the matroid M’ of § 16 satisfies Postulate (C*)
but corresponds to no linear graph. For, letting 123 be a base and (16. 2)
a fundamental set of circuits and determining the matroid as in Theorem 36,
we come out with exactly M’. A corresponding matrix of integers mod 2 is
constructed from (16.3) with a = b = c = d = 1; we interchange rows and
columns in the left-hand portion, leave out the last row and column of the
right-hand portion, and interchange these two parts. (The relation 2a = 0
is of course true mod 2.)

On the other hand, it is easily seen that if the element 7 is left out, there
is a corresponding graph, which must be of the following sort: It has four
vertices a, b, c, d, and the arcs corresponding to the elements 1, ..., 6 are


ab, ac, ad, bc, bd, cd. 


There is no way of adding the required seventh arc. 

The problem of characterizing linear graphs from this point of view 
is the same as that of characterizing matroids which correspond to matrices 
(mod 2) with exactly two ones in each column. 
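The last remark can be checked on the graph just described. The sketch below is an illustration only: it assumes the arcs 1, ..., 6 are ab, ac, ad, bc, bd, cd in this order and uses the vertex-arc incidence columns, each of which has exactly two ones. The names `incidence_column` and `add_mod2` are hypothetical.

```python
VERTICES = "abcd"
ARCS = ["ab", "ac", "ad", "bc", "bd", "cd"]   # elements 1, ..., 6

def incidence_column(arc):
    """Column with a 1 at each of the two end vertices of the arc."""
    return tuple(1 if v in arc else 0 for v in VERTICES)

def add_mod2(columns):
    """Sum (mod 2) of a list of columns."""
    return tuple(sum(bits) % 2 for bits in zip(*columns))

cols = {i + 1: incidence_column(arc) for i, arc in enumerate(ARCS)}

# The circuits 124, 135, 236, 456 are the four triangles of the graph.
for circuit in ([1, 2, 4], [1, 3, 5], [2, 3, 6], [4, 5, 6]):
    assert add_mod2([cols[e] for e in circuit]) == (0, 0, 0, 0)

# The circuit 1237 would force column 7 to be the sum of columns 1, 2, 3.
seventh = add_mod2([cols[1], cols[2], cols[3]])
print(seventh, sum(seventh))
```

For this incidence matrix the forced seventh column has four ones instead of two, so it is not the column of an arc; this is consistent with the statement that the seventh arc cannot be added.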


HARVARD UNIVERSITY. 




ON THE ASYMPTOTIC DISTRIBUTION OF THE REMAINDER 
TERM OF THE PRIME-NUMBER THEOREM. 


By AUREL WINTNER.


The result of the present note is to the effect that the Riemann hypothesis 
is equivalent not only with the best possible order of the remainder term of 
the prime-number theorem but, on a proper scale, also with a generalized
almost-periodic behavior of this remainder term. This means that the formally 
trigonometrical development of the remainder term is a Fourier development, 
which implies, in particular, the existence of an asymptotic distribution 
function. 

Let $\rho_1, \rho_2, \dots$ be the sequence of distinct zeros of $\zeta(s)$ in the
upper half-plane, so that


(1) $\rho_k = 1/2 + i\gamma_k$, $\gamma_{k+1} > \gamma_k > 0$ $(k = 1, 2, \dots)$
by assumption. Let $n_k$ denote the multiplicity of the zero $\rho_k$, and let


=0, = + Am), where > 1, 


(2) Am = > px, (m = 1, 2," 
k=1 


It is known * that 
(3) ¢(z) = lim ¢n(z) 


m=O 
exists and that on placing, as usual, 


= log p 
Dp 


the “explicit formula” of the prime-number theory may be written as 


(4) —y(a) = (x) + 


where $x \neq p^n$; at the discontinuity points, $x = p^n$, of $\psi(x)$ one has to
replace $\psi(x)$ by the arithmetical mean of $\psi(x + 0)$ and $\psi(x - 0)$. The
rôle of (1) for the distribution of the prime numbers is² that of implying for
the remainder term (4) of the prime-number theorem $\psi(x) \sim x$ the appraisal
$x^{1/2} O(x^{\epsilon})$ for any $\epsilon > 0$, and even the appraisal


1 Cf. E. Landau, Vorlesungen über Zahlentheorie, Leipzig, 1927, Theorem 452.
2 Ibid., Theorem 453.




(5) $(«) = O(log? «). 


Finally, (1) is equivalent * also with 


(6) (log fa — p(x) /x)* dz >> pr |?, o—> 


a relation which may be written, according to (4), in the form 


(6a) (log + log[2e(1 da 


00 
k=1 


Now not only (6a) holds but also 


(7) (log f — $m-1(%)] + log[2r(1 — a?) }? da 
> m*/| (m = 1, 2,° 0). 


If m = 1, then (7) reduces to (6a) or (6) in virtue of (2). While for
$m \neq 1$ the relation (7) is not clear from (6a), (3) and (2), a glance at the
proof of (6) shows that the proof of (7) needs but a repetition of the proof
of (6a), so that the proof of (7) will be omitted.

On denoting the integrand of (7) by 


(7a) — + Dn(x), 
it follows from (1), (2) and (5) that 


Dn (x) = 2a-*/2[ (2) Jlog[2x(1 + 
= (log? 2) + 0(1)]O(1) = ; 


hence 


J de = — 0(1) = o(logw). 
2 2 
Thus it is clear from (7), (7a) that, for every fixed m, 
m 
1,@., 


T co 


=m 


3 Ibid., Theorems 476 and 477. This result is due to Cramér.




Hence, on placing 


(8) M{g} lim g(z)de/T, 

1 
one has 
(9) M{(f —Sm-1)?} =2 px 
where Sm (2) $m(e"), f(x) 
so that 
(10) Sm (2) = > /py + /py) 
and 
(11) f(z) — me + 


in virtue of (1), (2) and (3). The relations (4) and (5) take the form 
(11a) =e f(x) + log 2r-+ O(e%), f(z) = O(2’). 


Now, from (9), 
(12) M{(f —Sm)?} ~0, 


Due to (10) and (11), the relation (12) might be expressed by saying that 
the trigonometrical series (11) not only is convergent but that it is the Fourier 
series of the function which it represents, i. e., that 


(12 a) f(a) ~ = /py + 


the equivalence sign ~ being understood in the Besicovitch sense.⁴ It must,
however, be mentioned that the averaging process (8) operates not in the
symmetric range $[-T, T]$ but only in the upper half of it; to the lower half
of this range there corresponds the range $0 < x < 1$ of $\phi(x)$, where the
behavior of the series $\phi(x)$ is just as intricate as in the range
$1 < x < \infty$. Since, however, (11) is a pure sine series, the formal
difficulty just mentioned may be avoided by defining the function f(x) for
$-\infty < x < -1$ by $f(x) = -f(-x)$ and for $-1 < x < 1$ arbitrarily. Then
$\int_1^T$ in (8) may be replaced by $\tfrac{1}{2}\int_{-T}^{T}$, so that the
function f(x) occurring in the fundamental formula (11a) belongs to the
Besicovitch class $B^2$ in virtue of (12). As a consequence of this fact, it
follows from the Besicovitch theory


4 A. S. Besicovitch, Almost periodic functions, Cambridge, 1932, Chap. II.




that the coefficients⁵ $n_k/\rho_k$, $n_k/\bar\rho_k$ of the series (11) may be
represented in the Fourier-Bohr manner as averages. It is clear from the
definition of f(x) for $x < -1$ that these expressions of the coefficients hold
also if M{ } is understood in the sense (8). Unfortunately, nothing is known
about the diophantine nature⁶ of the frequencies $\pm\gamma_n$ of the Fourier
expansion (12 a); in particular, it is not known in what manner the frequencies
are generated by a basis of linearly independent numbers.

If g(x) is a real-valued measurable function defined for $1 < x < \infty$, and
if, for a given $T > 1$ and a given real number $\xi$, one denotes by
$[g(x) < \xi; T]$ the set of those points x of the interval $1 < x < T$ at which
$g(x) < \xi$, the function g(x) is said to possess an asymptotic distribution
function $\sigma = \sigma(\xi)$ if at every continuity point $\xi$ of this $\sigma$
the relation


$T^{-1}\,\mathrm{meas}\,[g(x) < \xi; T] \to \sigma(\xi)$, $T \to \infty$,
holds, and
$\sigma(-\infty) = 0$, $\sigma(+\infty) = 1$.
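The definition just given can be made concrete by a crude numerical experiment. The sketch below is an illustration and not part of the argument: the test function $g(x) = \sin x + \sin(\sqrt{2}\,x)$, the grid step and the name `empirical_distribution` are all assumptions chosen for the example, and the measure is approximated by counting grid points.

```python
import math

def empirical_distribution(g, xi, T, step=0.01):
    """Approximate T^{-1} meas{x in [1, T] : g(x) < xi} on a uniform grid."""
    n = int((T - 1.0) / step)                       # number of grid points
    hits = sum(1 for i in range(n) if g(1.0 + i * step) < xi)
    return hits * step / T

def g(x):
    # A Bohr almost-periodic function with incommensurable frequencies.
    return math.sin(x) + math.sin(math.sqrt(2.0) * x)

for xi in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(xi, round(empirical_distribution(g, xi, 2000.0), 3))
```

The printed values increase with $\xi$ from 0 towards 1, as a distribution function must; how fast the averages settle as T grows is not addressed by this sketch.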


Thus $\sigma(\xi)$ is monotone and not everywhere constant. The limit defining
$\sigma(\xi)$ is zero for every $\xi$ if, as x increases indefinitely, g(x) is very
often very large; hence in such a case there does not exist an asymptotic
distribution function. Now $s_m(x)$ is, according to (10), real-valued and
almost-periodic in the Bohr sense and has therefore⁷ an asymptotic distribution
function. Hence it follows⁸ from (12) that f(x) also possesses an asymptotic
distribution function. This fact expresses a certain amount of regularity in
the fluctuations of f(x) and implies, in particular, that $|f(x)|$ cannot be very
often very large. On the other hand,⁹


(11 b) $f(x) = \Omega_{\pm}(\log\log x)$, $x \to \infty$,


so that neither f(x) nor $-f(x)$ is less than a positive constant. While the


5 It is not known if $n_k = 1$ for every k.

6 For a property of the numbers $\gamma_n$ which is, however, not of an
arithmetical nature, cf. pp. 101-102 of this volume.

7 A. Wintner, “Diophantische Approximationen und Hermitische Matrizen,” I,
Mathematische Zeitschrift, vol. 30 (1929), pp. 310-312. In the following year, Jessen,
and Bohr and Jessen, also proved the existence of asymptotic distribution functions.
Cf. also the programmatic address of Bohr in the Proceedings of the 5th Skandinavian
Congress (1922). For the recent development of the distribution theory, cf. B. Jessen
and A. Wintner, “Distribution functions and the Riemann zeta function,” Transactions
of the American Mathematical Society, July, 1935.

8 Cf. B. Jessen and A. Wintner, loc. cit., Theorem 24.

9 Cf. E. Landau, op. cit., Theorem 472, or H. Bohr, loc. cit. (11b) is due to
Littlewood.




asymptotic distribution function cannot be everywhere constant, it seems to
be difficult to decide whether it is nowhere constant¹⁰ or whether f(x)
“dislikes” some regions $a < f(x) < b$. The answer to this question might
depend¹¹ on the one mentioned at the end of the previous paragraph.

The relation (12 a) may be expressed also in terms of the Dirichlet series


oo 
F(s) —2 Rs > 0. 
=1 
It is known¹² that (11 b) depends on the behavior of
$\Im F(s)$ as $\Re s \to +0$.


Now $\Im F(s)$ on the boundary line $\Re s = 0$ not only is a convergent
trigonometrical series but is also the Fourier series of the function which it
represents. In fact, the number of zeros of $\zeta(s)$ in the strip
$0 < \Im s < T$ is $O(T \log T)$, even if one counts every zero according to its
multiplicity. This implies the convergence of the series


|. 


Hence it follows from (11) that the series representing the sum of f(x) and
$2\Im F(ix)$ is a trigonometrical series which possesses a convergent majorant
uniformly for all x and is therefore almost-periodic in the sense of Bohr.
Accordingly the trigonometrical series


oo 
k=1 


which defines the function $\Im F(ix)$, belongs to this function as its Fourier
series ($B^2$). In particular, $\Im F(ix)$ is a function of class $B^2$ and has
therefore an asymptotic distribution function.


THE JOHNS HOPKINS UNIVERSITY. 


10 Cf. A. Wintner, “Remarks on the Ergodic Theorem of Birkhoff,” Proceedings
of the National Academy of Sciences, vol. 18 (1932), p. 251.

11 Cf., in this connection, B. Jessen and A. Wintner, loc. cit.
12 Cf. E. Landau, op. cit., Theorem 470.




ON THE EXACT VALUE OF THE BOUND FOR THE REGULARITY 
OF SOLUTIONS OF ORDINARY DIFFERENTIAL EQUATIONS. 


By AUREL WINTNER.


Let f(z, w) be regular-analytic and bounded in the four-dimensional
domain $|z| < a$, $|w| < b$, and let w = w(z) be the solution of dw/dz = f(z, w)
which vanishes at z = 0. Let M denote the least upper bound of $|f(z, w)|$
in the domain $|z| < a$, $|w| < b$. It is known that there exists a bound
$\Gamma = \Gamma(a, b, M)$ which is independent of the particular choice of
f(z, w) and is such that w(z) is regular-analytic in the circle $|z| < \Gamma$.
In fact, the method of successive approximations yields the estimate


(1) $\Gamma(a, b, M) \geq \min(a, b/M)$.


The necessity of the limitation $|z| < a$ is obvious from the case where f(z, w)
is independent of w and has singularities on the circle $|z| = a$. On the other
hand, the necessity of the limitation $|z| < b/M$ is not evident. In fact, the
latter limitation is introduced into the proof of (1) only for a somewhat
artificial reason, namely in order to assure the possibility of successive
substitutions into f.

It turns out, however, that the trivial, and a priori artificial, appraisal (1)
cannot be improved, i.e., that the value of the best bound $\Gamma(a, b, M)$ is
precisely $\min(a, b/M)$. This situation seems to be unexpected insofar as
efforts have been made¹ to improve the lower estimate (1) of $\Gamma(a, b, M)$.

In reality, these efforts succeeded only by imposing additional restrictions on
f(z, w). Such a restriction is that f(z, w) satisfies a uniform Lipschitz
condition in the open domain $|z| < a$, $|w| < b$; and the corresponding
improved estimate of the regularity radius of w(z) depends¹ not only on a, b, M
but also on the Lipschitz constant. A proof of the upper estimate


(2) $\Gamma(a, b, M) \leq \min(a, b/M)$,


which clears up this situation, runs as follows.
If $a \leq b/M$ then (2) is obvious from the case where f(z, w) is
independent of w. In order to prove that (2) holds also when $a > b/M$, it is
clearly sufficient to show that for any given pair of numbers b, M and for any


1 Cf. P. Painlevé, Encyklopaedie der Mathematischen Wissenschaften, vol. 1, Part I,
p. 194 and p. 200.




given number r, where $r > b/M$, there exists a function f which is independent
of z and possesses the following properties:


(i) f(w) is regular-analytic and bounded in the circle $|w| < b$;
(ii) the least upper bound of $|f(w)|$ in this circle is M;
(iii) the function w = w(z) for which dw/dz = f(w) and w(0) = 0
has in the circle $|z| < r$ a singularity.


Now the function


(3) $f(w) = M[(1 + w/b)/2]^{1/n}$


satisfies all these conditions if n is sufficiently large, larger than a number
depending on r. In fact, the solution w(z) belonging to (3) is


$w(z) = b[(1 + z/C_n)^{n/(n-1)} - 1]$
where
$C_n = (1 - 1/n)^{-1}\, 2^{1/n}\, b/M$.
Thus $z = -C_n$ is a singular point of w(z) and tends to $z = -b/M$ when
$n \to +\infty$. This proves (iii), while (i) and (ii) are satisfied by (3) for
any n.
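The closing computation can be checked numerically. The sketch below takes (3) in the form $f(w) = M[(1 + w/b)/2]^{1/n}$ with $C_n = (1 - 1/n)^{-1} 2^{1/n} b/M$; this reading of the formulas is an assumption of the sketch, and the function names are illustrative. It verifies the differential equation at a real point by a central difference and shows that $C_n$ decreases towards b/M.

```python
def C(n, b, M):
    """Assumed form of the constant C_n belonging to (3)."""
    return 2.0 ** (1.0 / n) * n / (n - 1.0) * b / M

def w(z, n, b, M):
    """Assumed closed form of the solution with w(0) = 0 (real z > -C_n)."""
    return b * ((1.0 + z / C(n, b, M)) ** (n / (n - 1.0)) - 1.0)

def f(value, n, b, M):
    """Right-hand side (3) in the assumed form, for real |value| < b."""
    return M * ((1.0 + value / b) / 2.0) ** (1.0 / n)

b, M, n, z, h = 2.0, 3.0, 5, 0.2, 1e-6
derivative = (w(z + h, n, b, M) - w(z - h, n, b, M)) / (2.0 * h)
print(derivative, f(w(z, n, b, M), n, b, M))    # the two values agree closely

for k in (2, 10, 100, 10**6):
    print(k, C(k, b, M))                        # decreases towards b/M = 2/3
```

Since $C_n > b/M$ for every n, the singular point $-C_n$ enters the circle $|z| < r$ only when n is large enough, which matches the dependence of n on r stated in the text.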


THE JOHNS HOPKINS UNIVERSITY. 




ON SYMMETRIC BERNOULLI CONVOLUTIONS. 


By RICHARD KERSHNER and AUREL WINTNER.


A class of symmetric Bernoulli convolutions which are regular analytic
in the whole plane or in a strip containing the real axis or which possess,
at least, a high degree of smoothness along the real axis, has recently been
considered by one of the authors.¹ The present note deals mainly with the
other extreme case, where the convolution possesses but a very low degree of
smoothness. One class of convolutions which will be considered includes, for
instance, the well known Cantor function; and other functions which have
been treated in the literature also occur. These and some other examples have
been collected in a joint paper of B. Jessen and one of the present authors.²
The present note attempts a systematic treatment of a type of these symmetric
convolutions with a low degree of smoothness. Apart from the theory of
infinite convolutions, the functions to be considered are of interest from the
point of view of the theory of real functions. The dominating feature of some
of the convolutions in question is the homogeneous character of their spectra
and a corresponding homogeneity of the mapping involved. In particular,
one is led to absolutely continuous convolutions which might be termed
length-preserving with respect to a nowhere dense set of positive measure.
It turns out that Bernoulli convolutions of this type are identical with the
functions $\phi(x)$ considered by Hausdorff³ in connection with his fractional
dimension theory. While Hausdorff is mainly interested in the case where
the Lebesgue measure, which will be denoted by $\mu(E)$, is zero, his results hold
for the case of a positive Lebesgue measure also, a case with which the present
paper is mainly concerned. A class of Bernoulli convolutions which might be
termed complementary to the case of the Hausdorff functions $\phi(x)$ also is
considered.

1 A. Wintner, “On analytic convolutions of Bernoulli distributions,” American
Journal of Mathematics, vol. 56 (1934), pp. 659-663; “On symmetric Bernoulli
convolutions,” Bulletin of the American Mathematical Society, vol. 41 (1935), pp. 137-138.

2 B. Jessen and A. Wintner, “Distribution functions and the Riemann zeta-
function,” Transactions of the American Mathematical Society, vol. 37 (1935), § 6,
Theorem 11.

3 F. Hausdorff, “Dimension und äusseres Mass,” Mathematische Annalen, vol. 79
(1919), pp. 157-179.


Let B(x) denote the symmetric Bernoulli distribution of standard deviation 1
so that B(x) is 0, 1/2 or 1 according as x is on the left, in the
interior or on the right of the interval $-1 < x < +1$. Thus B(x/a), where
a > 0, also is a symmetric Bernoulli distribution function; the distribution
function

(1) $\tfrac{1}{2}(1 + \operatorname{sign} x)$,

which belongs to a = + 0, will not be considered as a Bernoulli distribution
function. The infinite Bernoulli convolution


(2) $\sigma(x) = B(x/a_1) * B(x/a_2) * \cdots$
is convergent if and only if
(3) $\sum_{n=1}^{\infty} a_n^2 < +\infty$,


and the convergence of (2) implies its absolute convergence.⁴ It will always
be supposed that (3) is satisfied. The function (2) always is continuous.⁵
Further,⁶ if $\sigma(x)$ is not absolutely continuous, it is singular, i.e. such that
$\sigma'(x) = 0$ almost everywhere. The spectrum S of $\sigma(x)$, defined as the set
of points x in the vicinity of which $\sigma$ is not constant, consists⁷ of those
points x which are representable in the form


(4) $x = \sum_{n=1}^{\infty} \pm a_n$,


where the signs depend on n in an arbitrary way, the only restriction being
that the series be convergent. Hence S is a bounded set or the whole real
axis according as the condition

(5) $\sum_{n=1}^{\infty} a_n < +\infty$


is or is not satisfied. Examples show⁸ that $\sigma$ may be singular or absolutely
continuous whether (5) is satisfied or not, so that all four possibilities actually
occur. The set S is always perfect, since the set of points in the vicinity
of which a continuous function is not constant is either perfect or empty.
On denoting by $\rho_n$ the infinite convolution


(6) $\rho_n(x) = B(x/a_{n+1}) * B(x/a_{n+2}) * \cdots$,


so that $\rho_n$ tends, as $n \to +\infty$, to the distribution function (1), either
all functions


4 B. Jessen and A. Wintner, loc. cit. 5 Ibid. 6 Ibid. 7 Ibid.
8 Ibid., examples 1, 3, 5, and 6.




(7) $\rho_0 = \sigma, \rho_1, \rho_2, \cdots$


are singular or all are absolutely continuous. For if $\rho_n(x)$ is singular, then
the derivative of


$\rho_{n-1}(x) = \rho_n(x) * B(x/a_n) = \tfrac{1}{2}[\rho_n(x + a_n) + \rho_n(x - a_n)]$


is zero almost everywhere; and if $\rho_n(x)$ is absolutely continuous, then so is
$\rho_{n-1}(x)$, since absolute continuity cannot be lost by the convolution process.⁹
It may be mentioned that if $S_n$ denotes the spectrum of $\rho_n$, then either all
sets


(8) $S_0 = S, S_1, S_2, \cdots$


are nowhere dense or none are nowhere dense. For if $S_n$ contains an interval,
then so does each of the sets¹⁰ $S_n - a_n$ and $S_n + a_n$, the logical sum of which
is $S_{n-1}$ in virtue of (4); further, if $S_n$ contains an interval, then so does
$S_{n+1}$. For if J be an interval in $S_n$, then $J - a_{n+1}$ either has an interval
in common with the perfect set $S_{n+1}$ or contains a subinterval $J' - a_{n+1}$
which does not contain any point of $S_{n+1}$. In the latter case $J' + a_{n+1}$ is
contained in $S_{n+1}$ by the definition of $S_n$ in terms of $S_{n+1}$, so that in
either case $S_{n+1}$ contains an interval.
From now on it will be supposed that S is bounded, i.e., that (5) is
satisfied, so that one may introduce the remainders


(9) $r_n = \sum_{m=n+1}^{\infty} a_m$ $(n = 0, 1, 2, \cdots)$.

The following theorem will now be proven:


If
(10) $a_n > \sum_{m=n+1}^{\infty} a_m = r_n$

9 In fact, absolute continuity, in the case of a distribution function $\psi(x)$,
means that there exists for every $\epsilon > 0$ a $\delta = \delta(\epsilon)$ such that

$\sum_k |\psi(x'_k) - \psi(x''_k)| < \epsilon$ whenever $\sum_k |x'_k - x''_k| < \delta$.

Since the latter inequality implies that $\sum_k |(x'_k - y) - (x''_k - y)| < \delta$
for any y, it also implies that $\sum_k |\psi(x'_k - y) - \psi(x''_k - y)| < \epsilon$
for any y. Hence it implies that

$\sum_k \left| \int_{-\infty}^{+\infty} \psi(x'_k - y)\,d\omega(y) - \int_{-\infty}^{+\infty} \psi(x''_k - y)\,d\omega(y) \right| \leq \epsilon \int_{-\infty}^{+\infty} d\omega(y) = \epsilon$

for any distribution function $\omega$ or that $\psi * \omega$ is absolutely
continuous for any distribution function $\omega$.

10 By $E + c$ is meant the set of points representable in the form $x + c$, where
x is a number contained in E.




for every n, then the spectrum S of the infinite Bernoulli convolution (2)
is nowhere dense and has the measure


(11) $\mu(S) = 2\sum_{n=1}^{\infty} a_n - \sum_{n=1}^{\infty} 2^n \left(a_n - \sum_{m=n+1}^{\infty} a_m\right) = 2 \lim_{n \to \infty} 2^n r_n$,
which may or may not be zero. If $\mu(S) > 0$ and X denotes the interval
$[-r_0, x]$, then
(12) $\sigma(x) = \mu(SX)/\mu(S)$,


i. e., the measure of the $\sigma$-image of an x-interval is proportional to the
portion of the nowhere dense spectrum contained in the interval. In particular,
$\sigma(x)$ is singular or absolutely continuous according as $\mu(S) = 0$ or
$\mu(S) > 0$. If it is absolutely continuous, then its density is bounded¹¹ since
$\sigma$ satisfies a uniform Lipschitz condition.

The fact that $\mu(S) > 0$ implies absolute continuity is of interest since if
(10) is not satisfied, then $\mu(S) > 0$ is necessary but not sufficient for the
absolute continuity of $\sigma(x)$; cf. (23).¹¹ᵃ That in the case (10) the
condition $\mu(S) > 0$ is sufficient for absolute continuity is clear from the
relation (12), since (12) implies the uniform Lipschitz condition


$|\sigma(x') - \sigma(x'')| \leq C\,|x' - x''|$,


the best value of C being $1/\mu(S)$. The second representation of $\mu(S)$ given
in (11) shows that $\mu(S) = 0$ or $> 0$ according as $r_n = o(2^{-n})$ is or
is not satisfied, and that (10) implies the existence of $\lim 2^n r_n$.

It is seen from (9) that (10) may be written in the form


(13) $r_n > 2r_{n+1}$ $(n = 0, 1, \cdots)$.


This clearly implies the possibility of a successive construction of open
subintervals J of the closed interval $[-r_0, r_0]$ as follows. Let $J_{12}$ denote
the open interval which is symmetric with respect to the mid-point of
$[-r_0, r_0]$ and is of length $2(r_0 - 2r_1)$. From each of the two closed
intervals $K_1, K_2$ which constitute $[-r_0, r_0] - J_{12}$ one may remove an open
interval of length $2(r_1 - 2r_2)$ and having the mid-point of $K_i$, where
i = 1, 2, as mid-point. This is possible since $K_1$ and $K_2$ are each of length
$2r_1 > 2(r_1 - 2r_2)$. Let $J_{14}$ and $J_{34}$ denote the open intervals thus
removed and let $J_{14}$ be the one

11 Up to a set of measure zero.

11a For another example of this type cf. A. Denjoy, “Sur quelques points de la
théorie des fonctions,” Comptes Rendus, vol. 194 (1932), pp. 44-46, and H. Minkowski,
Gesammelte Abhandlungen, vol. 2 (1911), pp. 50-51 and fig. 7.




which is to the left of $J_{12}$. From each of the four closed intervals which
constitute
$[-r_0, r_0] - J_{12} - J_{14} - J_{34}$


one may remove open intervals $J_{18}, J_{38}, J_{58}, J_{78}$ in the same symmetric
manner as $J_{14}$ and $J_{34}$ have been removed from $K_1$ and $K_2$, it being
understood that each of the four intervals $J_{k8}$ is of length $2(r_2 - 2r_3)$ and
that $J_{k8}$ is on the left of $J_{h8}$ if k < h. On continuing this process one
obtains for every n


(14) $2^n$ intervals $J_{k\,2^{n+1}}$ of length $2(r_n - 2r_{n+1})$,
where $k = 1, 3, 5, \cdots, 2^{n+1} - 1$ $(n = 0, 1, 2, \cdots)$.


It is convenient to write the double subscript of $J_{pq}$ as a fraction by placing
$J_{pq} = J_{p/q}$, so that $J_t$ is defined for every number t of the form


(15) $t = \sum_{j=1}^{m} b_j/2^j$, where $b_j = 0$ or $b_j = 1$,


i.e., for every number of the interval 0 < t < 1 having a finite dyadic
development. $J_t$ and $J_u$ are, by their successive construction, disjoint if
$t \neq u$. Now it is easy to verify¹² from the definition of $B(x/a_n)$ and from
that of the convolution operator * that


(16) $\sigma(x) = t$ if x is in $J_t$.

Since the points (15) lie dense in the interval 0 < t < 1 and since the
distribution function $\sigma(x)$ is everywhere continuous, it follows that every
subinterval of $[-r_0, r_0]$ contains a $J_t$. Consequently, the set


(17) $[-r_0, r_0] - \sum_t J_t$,


where t runs through all values (15), is nowhere dense and consists of the
cluster points of the endpoints of the open intervals $J_t$. Hence it is clear
from (16) and from the definition of the spectrum S that the set (17) is
contained in S. Since S is a subset of $[-r_0, r_0]$ in virtue of (4), the set
(17) is precisely S. Accordingly, $J_t$ and $J_u$ being disjoint if $t \neq u$,


$\mu(S) = \mu([-r_0, r_0]) - \sum_t \mu(J_t) = 2r_0 - \sum_t \mu(J_t)$,
so that
$\mu(S) = 2r_0 - 2\sum_{n=0}^{\infty} 2^n (r_n - 2r_{n+1})$


in virtue of (14). On comparing this with (9) one obtains (11).


12 Cf. B. Jessen and A. Wintner, loc. cit.




It must now be shown that in the case $\mu(S) > 0$ the relation (12) holds.¹³
Since $\sigma(x)$ is non-decreasing and continuous, it is sufficient to verify (12)
for a dense set of points x. Now $[-r_0, r_0]$ contains S, so that (12) is trivial
if x is not in $[-r_0, r_0]$. Since $\sum_t J_t$ is dense in $[-r_0, r_0]$, it follows
that it is sufficient to verify (12) for the points x of a $J_t$. Let x be in
$J_t$ and let $t = k/2^m$. Then the $2^m - 1$ intervals $J_\tau$, where
$\tau = j/2^l$ and $j = 1, 3, 5, \cdots, 2^l - 1$; $l = 1, 2, 3, \cdots, m$,
decompose S into $2^m$ congruent parts, each of which has the measure
$2^{-m}\mu(S)$ since the intervals $J_t$ have been removed symmetrically. Since
there are, among the $2^m$ congruent parts of measure $2^{-m}\mu(S)$, exactly k on
the left of the point x, one has


$\mu(SX) = k\,2^{-m}\mu(S)$.


This proves (12) since $t = k/2^m$, and $\sigma(x) = t$ by (16).

The Hausdorff theory¹⁴ of A-measure and its further development by
Besicovitch¹⁵ allow, of course, an analysis of the case $\mu(S) = 0$ also.

As an illustration of the theorem, let


(18) $a_n = Aa^n + Bb^n$, where $0 < a < b < 1$, $A > 0$, $B \geq 0$.
It is easily verified that (10) is satisfied if and only if
(19) $b \leq 1/2$


and that (11) gives $\mu(S) = 0$ or $= 2B$, according as $b < 1/2$ or $b = 1/2$.
Thus if
(20) $a_n = B(1/2)^n + Aa^n$, where $0 < a < 1/2$, $B > 0$, $A > 0$,


then (2) is absolutely continuous with a nowhere dense set of positive
measure as spectrum and is represented by the formula (12).
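The condition (10) and the limit in (11) can be checked numerically for sequences of the type (18) and (20). The sketch below is only an illustration: the infinite remainders $r_n$ are approximated by long finite sums, the limit by the value at n = 30, and the names `remainders`, `satisfies_10` and `measure_estimate` as well as the sample constants are assumptions.

```python
def remainders(a, n_max, tail=400):
    """r_n = sum of a(m) over m > n, truncated at m = n_max + tail."""
    terms = [a(m) for m in range(1, n_max + tail + 1)]   # terms[m - 1] = a(m)
    r = [0.0] * (n_max + 1)
    r[n_max] = sum(terms[n_max:])
    for n in range(n_max - 1, -1, -1):
        r[n] = r[n + 1] + terms[n]                       # r_n = r_{n+1} + a_{n+1}
    return r

def satisfies_10(a, r, n_max):
    """Condition (10): a_n > r_n for n = 1, ..., n_max."""
    return all(a(n) > r[n] for n in range(1, n_max + 1))

def measure_estimate(r, n):
    """Finite-n value of 2 * 2^n * r_n, whose limit is mu(S) by (11)."""
    return 2.0 * 2.0 ** n * r[n]

def fat(n):      # sequence (20) with B = 1, A = 1, a = 1/4
    return 0.5 ** n + 0.25 ** n

def cantor(n):   # sequence (18) with A = 1, a = 1/3, B = 0
    return (1.0 / 3.0) ** n

N = 30
for name, seq in (("(20)", fat), ("Cantor", cantor)):
    r = remainders(seq, N)
    print(name, satisfies_10(seq, r, N), measure_estimate(r, N))
```

For the sequence (20) the estimate is close to 2B = 2, while for the Cantor case it is close to 0, in agreement with the statement following (19).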

The infinite convolution (2) belonging to the sequence (18) in the case
A = 1, B = 0 will be denoted by $\sigma_a(x)$, so that


(21) $\sigma_a(x) = B(x/a) * B(x/a^2) * \cdots$ $(0 < a < 1)$.


Since (19) takes, in the case B = 0, the form $a < 1/2$ in virtue of $a < b$,
the function $\sigma_a(x)$ is singular with a spectrum S of zero measure if
$a < 1/2$. In particular, $\sigma_{1/3}(x)$ is the usual Cantor function considered
by Lebesgue.


13 Cf. F. Hausdorff, loc. cit., § 11.
14 F. Hausdorff, loc. cit., §§ 10-12.
15 A. S. Besicovitch, “On linear sets of points of fractional dimension,”
Mathematische Annalen, vol. 101 (1929), pp. 161-193.




On the other hand, $\sigma_{1/2}(x)$ is the distribution function of the
“Abrundungsfehler,” i. e.,
$\sigma_{1/2}(x) = (x + 1)/2$ if $-1 \leq x \leq 1$


and S = [−1, 1], so that $\sigma_{1/2}(x)$ is absolutely continuous with a bounded
density. This example shows that on replacing > in (10) by =, the spectrum
S of (2) may become an interval. Since the sequence $\{(1/2)^n\}$ consists of
the two sequences $\{(1/4)^n\}$ and $\{(1/4)^n/2\}$, it is clear that


(22 a) * /4(22) = 01/2(2) ; 


this relation is an instance of the fact that the convolution of two singular
Bernoulli convolutions may be absolutely continuous.¹⁶ On the other hand,


(22 b) * o1/4(@) 


is singular. This is shown in the same way¹⁷ as for $\sigma_{1/3}(x) * \sigma_{1/3}(x)$.
The interest of the latter example lies in the fact that although it is singular,
the spectrum is an interval, as will follow from (23). Besides, (23) will
show that the spectrum of $\sigma_a(x)$ is an interval not only in the limiting case
$a = 1/2$ but in the case $1/2 < a < 1$ as well.

If > in (10) be replaced by $\leq$, then S becomes connected:


(23) $S = [-r_0, r_0]$ if $a_n \leq r_n$ $(n = 1, 2, \cdots)$.


This is easily seen from (4) and (9) by an obvious extension of the proof of
Riemann's theorem according to which


$0 < a_n \to 0$ and $\sum_{n=1}^{\infty} a_n = +\infty$


imply that every real number is representable in the form (4). The examples
$a_n = (1/2)^n$ and $a_{2n} = a_{2n+1} = (1/3)^n$


show that in the case (23) both absolutely continuous and singular
convolutions (2) are possible.
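One way to carry out the extension of Riemann's argument behind (23) is a greedy choice of signs: at each step the sign is taken so as to move the partial sum towards the target x. The following sketch is an assumption about how the proof may be organized, not a transcription of it; under $a_n \leq r_n$ an induction shows that the error after n steps is at most $r_n$. The function name `greedy_signs` and the indexing of the second example (with $a_1 = a_2 = 1/3$) are illustrative choices.

```python
def greedy_signs(x, a, steps):
    """Choose signs s_n = +1 or -1 so that the partial sums of s_n * a(n) approach x."""
    partial = 0.0
    signs = []
    for n in range(1, steps + 1):
        s = 1 if x - partial >= 0 else -1   # step towards the target
        signs.append(s)
        partial += s * a(n)
    return signs, partial

def halves(n):    # a_n = (1/2)^n, so that r_0 = 1 and a_n = r_n
    return 0.5 ** n

def thirds(n):    # 1/3, 1/3, 1/9, 1/9, ..., so that r_0 = 1 and a_n <= r_n
    return (1.0 / 3.0) ** ((n + 1) // 2)

for target in (0.3, -0.77, 1.0):
    _, approx = greedy_signs(target, halves, 50)
    print(target, approx)
```

Every target in $[-r_0, r_0] = [-1, 1]$ is approached to within $r_n$ after n steps, so that each such point is representable in the form (4).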

If (10), i.e. $a_n > r_n$, is satisfied not for every n but only for sufficiently
large values of n, then S is still nowhere dense since $S_n$ in (8) is then nowhere
dense if n is sufficiently large. If $a_n \leq r_n$ is satisfied not for every n but
only for sufficiently large n, then $S_n$ is, according to (23), an interval if n is


16 Cf. P. Lévy, “Sur les séries dont les termes sont des variables éventuelles
indépendantes,” Studia Mathematica, vol. 3 (1931), p. 153.
17 Cf. B. Jessen and A. Wintner, loc. cit., example 2.




sufficiently large, so that $S = S_0$ consists of a finite number ($\geq 1$) of
intervals in virtue of (4). The following example shows that S may consist of
an arbitrarily large number of disjoint intervals.

For a given $\alpha > 1$, let $\sigma^{\alpha}(x)$ denote the Bernoulli convolution
belonging to $a_n = n^{-\alpha}$, so that


(24) $\sigma^{\alpha}(x) = B(x) * B(2^{\alpha}x) * B(3^{\alpha}x) * \cdots$,


and let $S^{\alpha}$ denote the spectrum of (24). Since $a_n = n^{-\alpha}$ satisfies
$a_n \leq r_n$ for sufficiently large n if $\alpha$ is fixed, the spectrum
$S^{\alpha}$ consists of $N = N_{\alpha}$ disjoint intervals. It is easy to see that,
as $\alpha \to +\infty$, the number of intervals increases indefinitely while
$S^{\alpha}$ shrinks to the set consisting of the pair of points $x = \pm 1$. In this
connection it is interesting to mention that¹⁸ $\sigma^{\alpha}(x)$ possesses, for
every fixed $\alpha$, derivatives of arbitrarily high order for every x, although
$\sigma^{\alpha}(x)$ cannot, of course, be analytic at the end points of the
$N_{\alpha}$ intervals which constitute $S^{\alpha}$. It is not known whether
$\sigma^{\alpha}(x)$ is or is not analytic in the interior of these intervals. Since
every n may be written uniquely in the form $n = 2^k(2m + 1)$, it is clear from
(24) that¹⁹


(25) * o¢(4/3*) * o¢(4/5%) *- where c = (1/2)* < 1/2, 


so that o-(z) is singular with a spectrum of zero measure. It may be men- 
tioned that if a is near enough to 1, then Nz —1, and that S¢—> [— ow, o] 
asa—>1-+ 0. 


THE JOHNS HOPKINS UNIVERSITY. 


18 A. Wintner, loc. cit.
19 Cf. P. Lévy, loc. cit., p. 154.




ON UNIFORM CONVERGENCE. 


By J. W. THEODORE SUCKAU. 


Introduction. While in Classical Analysis it is always explicitly pre- 
supposed that a convergent sequence of functions is uniformly convergent, 
it was first observed by Egoroff¹ that a very strong type of approximate
uniform convergence is automatically present in every convergent sequence
of measurable functions.

The theorem of Egoroff has led to many investigations, in particular by
F. Riesz² who has applied the theorem in a very interesting manner to the
Lebesgue theory.³

It is the purpose of this paper to investigate the phenomenon of uniform 
convergence in a general way so as to obtain a better understanding of the 
situation in the case of measurable functions. 


I. Uniform Convergence. Suppose that the sequence $f_n(x)$⁴ is defined
and convergent on a set $S$. If the sequence is uniformly convergent on some
subset of $S$, that subset is said to have the character U (and is designated by
the letter $U$).

We introduce a class of sets, $\mathfrak{U}$, namely the totality of subsets of $S$ having
the character U.

The class $\mathfrak{U}$ is neither the null set nor does it contain only the null set,
since any subset of $S$ with a finite number of elements is a subset of character
U. Moreover, the addition of a finite number of points of $S$ to an element
of $\mathfrak{U}$ yields another element in $\mathfrak{U}$. Hence, except in the trivial case when the
sequence is uniformly convergent on $S$ there is no largest subset in the class $\mathfrak{U}$.


1 D. Th. Egoroff, “Sur les suites des fonctions measurables,” Comptes Rendus,
Paris, vol. 152 (1910), pp. 244-246.

2 F. Riesz, (i) “Sur l’intégrale de Lebesgue,” Acta Mathematica, vol. 42 (1920),
pp. 191-205; (ii) “Sur le théorème de M. Egoroff et sur les opérations fonctionnelles
linéaires,” Acta Litt. Sci. Szeged, vol. 1 (1922), pp. 18-25; (iii) “Elementarer Beweis
des Egoroffschen Satzes,” Monatshefte für Mathematik und Physik, vol. 25 (1928),
pp. 243-248.

3 F. Riesz, footnote 2 (i), above; loc. cit., pp. 196-205.
4 Unless otherwise stated all functions are entirely unrestricted.




In the special case when $S$ is itself an element of $\mathfrak{U}$ it is the largest
element and the class $\mathfrak{U}$ is coincident with the totality of subsets of $S$. However,
even when this is not the case it is possible by a very simple process⁵
to determine the class $\mathfrak{U}$. This process is fundamental in the paper.

Let $S(k, \nu)$ be the subset of $S$ on which the inequality

$|f_n(x) - f_m(x)| \le 1/k$

holds for every $n, m \ge \nu$.
Take any sequence of positive integers

$(\nu): \nu_1, \nu_2, \nu_3, \cdots$.

Finally, consider

$U(\nu) = \prod_{k=1}^{\infty} S(k, \nu_k)$.


THEOREM. The sequence $f_n(x)$ is uniformly convergent on $U(\nu)$, and
conversely, if the sequence $f_n(x)$ is uniformly convergent on a subset $S^*$ of $S$,
then there exists a sequence $(\nu)$ such that $U(\nu)$ contains $S^*$. In other words
the totality of sets $U(\nu)$ together with their subsets is the class $\mathfrak{U}$.


Proof. To prove the first part, take any $\epsilon > 0$ and pick $k$ so that $1/k \le \epsilon$.
Now every $x$ in $U(\nu)$ is also in $S(k, \nu_k)$, and so for every $x$ in $U(\nu)$ it is true
that $|f_n(x) - f_m(x)| \le 1/k \le \epsilon$ for $n, m \ge \nu_k$. Hence the sequence is
uniformly convergent on $U(\nu)$.

Conversely, if $f_n(x)$ is uniformly convergent on $S^*$ then given any positive
integer $k$ there exists a $\nu_k$ such that for all $n, m \ge \nu_k$ it is true that
$|f_n(x) - f_m(x)| \le 1/k$ for every $x$ in $S^*$. Thus a sequence $(\nu): \nu_1, \nu_2, \nu_3, \cdots$
has been defined. $S^*$ is contained in $S(k, \nu_k)$ for every value of $k$ and
therefore $S^*$ is contained in $\prod_{k=1}^{\infty} S(k, \nu_k) = U(\nu)$.
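
The construction of the sets $U(\nu)$ can be made concrete on a simple example. The sketch below is an illustration under assumptions not taken from the paper: it uses $f_n(x) = x^n$ on $S = [0, 1)$, for which $\sup_{n,m \ge \nu} |x^n - x^m| = x^{\nu}$, so that $S(k, \nu)$ is the set of $x$ with $x^{\nu} \le 1/k$. The function names are hypothetical.

```python
def s_endpoint(k, nu):
    """Right endpoint of S(k, nu) for f_n(x) = x**n on [0, 1).

    Since x**n decreases to 0, the sup over n, m >= nu of |x**n - x**m|
    is x**nu, so S(k, nu) is the set of x in [0, 1) with x**nu <= 1/k.
    """
    return k ** (-1.0 / nu)


def u_endpoint(nu, k_max):
    """Right endpoint of the intersection of S(k, nu(k)) for k = 1..k_max."""
    return min(s_endpoint(k, nu(k)) for k in range(1, k_max + 1))


u_linear = u_endpoint(lambda k: k, 50)       # the sequence nu_k = k
u_square = u_endpoint(lambda k: k * k, 50)   # the sequence nu_k = k**2
```

Different sequences $(\nu)$ give different sets $U(\nu)$: here a faster growing $(\nu)$ yields a larger interval, in line with the fact that the class has no largest element when the convergence is not uniform on all of $S$.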
1.2. Now that the class $\mathfrak{U}$ has been determined it is interesting to
examine its elements. We know that every finite subset of $S$ is contained
in $\mathfrak{U}$. Under certain conditions we are assured that at least one infinite
subset of $S$ is a member of $\mathfrak{U}$.


THEOREM. If $S$ is non-denumerable, then there exists a denumerably
infinite subset on which the convergence is uniform.


Proof. There is an $\alpha_1$ such that $S(1, \alpha_1)$ is non-denumerable. This is


5 Though not explicitly stated by Riesz, the process used by him is essentially the
same. Footnote 2 (iii), p. 549, loc. cit., p. 244.




so since $S = \sum_{\nu=1}^{\infty} S(1, \nu)$ and hence if every $S(1, \nu)$ were denumerable it would
follow that $S$ had this property, which is contrary to hypothesis.

Pick $x_1$ from $S(1, \alpha_1)$.

Consider the sequence now as defined only on $S(1, \alpha_1)$ which is
non-denumerable. There is an $\alpha_2$ such that when the construction of 1.1 is
considered with respect to $S(1, \alpha_1)$, then $S(2, \alpha_2 : 2)$ is non-denumerable.
($S(2, \alpha_2 : 2)$ is the set of points of $S(1, \alpha_1)$ where $|f_n(x) - f_m(x)| \le 1/2$
for $n, m \ge \alpha_2$).

Pick $x_2 \ne x_1$ from $S(2, \alpha_2 : 2)$.

Moreover, there is a $\beta_2$ such that for all $n, m \ge \beta_2$

$|f_n(x_i) - f_m(x_i)| \le 1/2$ $(i = 1)$.

Consider the sequence as defined only on $S(k-1, \alpha_{k-1} : k-1)$ which is
non-denumerable. There is an $\alpha_k$ such that when the construction of 1.1 is
considered with respect to $S(k-1, \alpha_{k-1} : k-1)$ then $S(k, \alpha_k : k)$ is
non-denumerable. ($S(k, \alpha_k : k)$ is the set of points of $S(k-1, \alpha_{k-1} : k-1)$ where
$|f_n(x) - f_m(x)| \le 1/k$ for $n, m \ge \alpha_k$).

Pick $x_k \ne x_1, x_2, \cdots, x_{k-1}$ from $S(k, \alpha_k : k)$.

Moreover, there is a $\beta_k$ such that for $n, m \ge \beta_k$

$|f_n(x_i) - f_m(x_i)| \le 1/k$ $[i = 1, 2, 3, \cdots, (k-1)]$.

Choose $\nu_k = \overline{\alpha_k, \beta_k}$,⁶ the larger of $\alpha_k$ and $\beta_k$.

Define $(\nu): \nu_1, \nu_2, \nu_3, \cdots$.

Then $U(\nu)$ contains the set $x_1, x_2, x_3, \cdots$.

$S(k, \alpha_k : k)$ certainly contains the set $x_k, x_{k+1}, \cdots$, since $S(k, \alpha_k : k)$
contains $S(k+1, \alpha_{k+1} : k+1)$; while $x_1, x_2, \cdots, x_{k-1}$ is included in
$S(k, \beta_k)$. Furthermore, $S(k, \nu_k)$ contains both the sets $S(k, \alpha_k : k)$ and
$S(k, \beta_k)$ and so it contains the denumerably infinite set $x_1, x_2, x_3, \cdots$ for
every value of $k$.

This proves the theorem.

That the non-denumerability of S in the hypothesis of the above theorem 
is essential is shown by the following example. 


Define $f_n(x) = 0$ on $1, 1/2, 1/3, \cdots, 1/n$,
and $f_n(x) = 1$ on $1/(n+1), 1/(n+2), 1/(n+3), \cdots$.

6 $\overline{a, b, \cdots, c}$ is the largest of this set of values, $\underline{a, b, \cdots, c}$ the smallest.




Now $f_n(x) \to 0$ on the set $\{1/n\}$, and yet on any infinite subset the
convergence is not uniform.
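
The example above can be checked mechanically. The following sketch indexes the point $x = 1/j$ by the integer $j$; the helper names and the choice of the subset $\{1/i^2\}$ are illustrative assumptions, not part of the paper.

```python
def f(n, j):
    """Value of f_n at the point x = 1/j, for j = 1, 2, 3, ..."""
    return 0 if j <= n else 1


def sup_on_finite(n, indices):
    """sup of |f_n - 0| over the finite set {1/j : j in indices}."""
    return max(f(n, j) for j in indices)


def witness(n, g):
    """Index g(n + 1) > n of a point in the infinite subset {1/g(i)}.

    g is assumed strictly increasing with positive integer values,
    so g(n + 1) >= n + 1.
    """
    return g(n + 1)


squares = lambda i: i * i   # illustrative infinite subset {1/i**2}
```

On a finite subset the supremum vanishes once $n$ passes the largest index, whereas every infinite subset contains, for each $n$, a point at which $f_n$ equals 1.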
This example and the above theorem may be combined in the statement
of the following theorem:


A necessary and sufficient condition that each convergent sequence of
functions defined on $S$ have associated with it a denumerably infinite subset
of $S$ on which the convergence is uniform, is that $S$ be non-denumerable.


1.3. The question arises as to whether or not this theorem is the best
of its kind. We might perhaps always have a subset of character U which
is non-denumerable. We are going to develop an example to show that this
is not in general possible if we assume the hypothesis of the continuum, that
$2^{\aleph_0} = \aleph_1$. In other words the above theorem is in a way final.

The example we shall develop is that of a sequence of functions con- 
verging to zero on the whole continuum and yet uniformly convergent on 
no non-denumerable subset. 

Before going to the construction of the example we shall state two 
auxiliary theorems. 


AUXILIARY THEOREM I. If a null sequence⁷ of functions is uniformly
convergent on a set $S$, then there is a null sequence of numbers which
dominates⁸ the sequence of functions on $S$.


This is merely a restatement of the definition of uniform convergence on $S$.


AUXILIARY THEOREM II. Given any denumerable aggregate of null
sequences (of numbers) there exists a null sequence dominated by none of the
sequences of the aggregate.


Proof. Suppose that the given sequences are:

$a_{11}, a_{12}, a_{13}, \cdots$
$a_{21}, a_{22}, a_{23}, \cdots$
$\cdots$

(i) Pick the first value of $n$, say $\nu(1)$ such that $|a_{1,\nu(1)}| < 1$.
Cancel the first $\nu(1)$ columns.

7 A null sequence is a sequence converging to zero.
8 The sequence $a_n$ dominates the sequence $b_n(x)$ on $S$ if there is a subscript $n_0$
such that for all $n > n_0$, $a_n \ge b_n(x)$, on $S$.




(ii) Pick the first remaining value of $n$, say $\nu(2)$ such that $|a_{1,\nu(2)}| < 1/2$.
Cancel the next $\nu(2) - \nu(1)$ columns.

Pick the first remaining value of $n$, say $\nu(3)$ such that $|a_{2,\nu(3)}| < 1/2$.
Cancel the next $\nu(3) - \nu(2)$ columns.

(iii) Pick the first remaining value of $n$, say $\nu(4)$ such that $|a_{1,\nu(4)}| < 1/3$.
Cancel the next $\nu(4) - \nu(3)$ columns.

Pick the first remaining value of $n$, say $\nu(5)$ such that $|a_{2,\nu(5)}| < 1/3$.
Cancel the next $\nu(5) - \nu(4)$ columns.

Pick the first remaining value of $n$, say $\nu(6)$ such that $|a_{3,\nu(6)}| < 1/3$.
Cancel the next $\nu(6) - \nu(5)$ columns.


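Now construct the sequence $b_1, b_2, b_3, \cdots$ in this wise:
$b_1 = b_2 = \cdots = b_{\nu(1)} = 1$,
$b_{\nu(1)+1} = \cdots = b_{\nu(3)} = 1/2$,
$b_{\nu(3)+1} = \cdots = b_{\nu(6)} = 1/3$, $\cdots$.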


The sequence {b,} is a null sequence and it is evident that it is dominated 
by none of the given sequences. 
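
The diagonal process of the proof can be sketched for finitely many stages. The code below is an illustration only: it works with a finite list of sequences given as callables, the name `undominated_prefix` and the sample sequences $a_i(n) = i/n$ are assumptions, and it assumes the stage-wise rule that $b_n = 1/p$ on the columns consumed during stage $p$.

```python
from fractions import Fraction


def undominated_prefix(seqs, stages):
    """Build a prefix of b and the columns where each a_i falls below b.

    seqs: list of callables a_i(n), n >= 1, each tending to 0.
    At stage p, for each row i < min(p, len(seqs)), take the first
    remaining column n with |a_i(n)| < 1/p; b is 1/p on every column
    consumed during stage p.
    """
    b = []          # b[n - 1] holds b_n
    witnesses = []  # pairs (i, n) with |a_i(n)| < b_n
    n = 0
    for p in range(1, stages + 1):
        level = Fraction(1, p)
        for i in range(min(p, len(seqs))):
            n += 1
            while abs(seqs[i](n)) >= level:
                b.append(level)   # column n is cancelled without a witness
                n += 1
            b.append(level)       # column n is the picked value nu(.)
            witnesses.append((i, n))
    return b, witnesses


# Illustrative null sequences a_i(n) = i / n for i = 1, 2, 3.
seqs = [lambda n, c=c: Fraction(c, n) for c in (1, 2, 3)]
b, witnesses = undominated_prefix(seqs, 4)
```

Each row acquires a new witness column at every later stage, which is why none of the given sequences can dominate the full sequence $\{b_n\}$.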

We may now go to the actual construction of the example. 

Normally order the following sets: (i) all real numbers, (ii) the totality
of null sequences of numbers.


(i) $x_1, x_2, x_3, \cdots, x_{\lambda}, \cdots$
(ii) $A_1, A_2, A_3, \cdots, A_{\lambda}, \cdots$ $(\lambda < \Omega)$


where $\Omega$ is the ordinal which initiates the cardinal $c$.
The required sequence will be defined as an $\aleph_0$-valued function, $B(x)$.
Define $B(x_{\lambda})$ to be the first sequence in (ii) which is not dominated by
any $A_{\mu}$ for all $\mu \le \lambda$. Suppose that the sequence is $A_{\nu(\lambda)}$.

It follows from Auxiliary Theorem II that there is a null sequence not
dominated by $A_{\mu}$ for all $\mu \le \lambda$; for, these null sequences constitute a
denumerable aggregate, as $\lambda$ is ordinally less than $\Omega$, the ordinal which initiates
the cardinal $c$, and it is assumed that $c = 2^{\aleph_0} = \aleph_1$. Thus the definition of
$B(x)$ is complete.

$B(x)$ is possibly dominated by a null sequence, say $A_{\nu}$, only for $x$'s with
subscripts $\lambda < \nu$; for if $\nu \le \lambda$, then by definition $B(x_{\lambda})$ is not dominated by $A_{\nu}$.
In other words $B(x)$ is possibly dominated by $A_{\nu}$ only for the $x$'s in the set



$x_1, x_2, x_3, \cdots, x_{\beta}, \cdots$ $(\beta < \nu)$


that is on at most a denumerable set.
Suppose that the sequence $A_{\nu(\lambda)}$ is

$l_1, l_2, l_3, \cdots$.

Define $f_n(x_{\lambda}) = l_n$.

Evidently $f_n(x) \to 0$. Moreover, the sequence $f_n(x)$ is uniformly
convergent on no non-denumerable set; for if it were uniformly convergent on
some non-denumerable set, then, by Auxiliary Theorem I, it would be
dominated by a null sequence on this set, which would imply that $B(x)$ is dominated
by a null sequence over a non-denumerable set, and this has been shown to
be impossible.


1.4. Let us summarize these results in the following manner. In
examining the elements of the class $\mathfrak{U}$ six possibilities present themselves.


(i) The class $\mathfrak{U}$ is the null set.
(ii) The class $\mathfrak{U}$ contains only the null set.
(iii) The class $\mathfrak{U}$ contains only finite subsets of $S$.
(iv) The class $\mathfrak{U}$ contains only denumerable subsets of $S$.
(v) The class $\mathfrak{U}$ contains at least one non-denumerable subset of $S$.
(vi) The class $\mathfrak{U}$ contains the set $S$.


The results of the preceding section show that:
I. If $S$ is itself denumerable, then (i) and (ii) are impossible, but it is
possible for (iii), (iv) and (vi) to occur.


II. If $S$ is itself non-denumerable, then (i), (ii) and (iii) are
impossible, but it is possible for (iv), (v) and (vi) to occur.

Hence it appears that in a special case where $S$ is non-denumerable and
no $U$ is non-denumerable, the sequence is behaving with the utmost possible
stubbornness with respect to the property of uniform convergence.


II. Approximate uniform convergence. So far in our investigation of
the sets $U$ in the class $\mathfrak{U}$ we have confined ourselves to their power. We now
make a finer distinction and consider their measure.

A sequence $f_n(x)$ is said to be approximately uniformly convergent on $S$
if for every positive $\epsilon$ there is a $U$ such that $m_e(S - U) < \epsilon$.⁹ We would
like to find conditions on the set $S$ and the functions $f_n(x)$ which are sufficient
to insure the above phenomenon.


9 We might use a weaker inequality: $m_e(S) - m_e(U) < \epsilon$. As far as I know
this does not lead to any results.




A set of sufficient conditions is supplied by the theorem of Egoroff.¹⁰ We
have already alluded to this theorem and the work of F. Riesz in deriving
simple proofs. Perhaps the simplest of these proofs is that of the Monatshefte.¹¹
In the light of the first chapter it is in my opinion possible to get a better
appreciation of this beautiful proof of Riesz.

As a first trivial remark¹² let us point out that if all the sets $S - S(k, \nu)$
are closed, then $\mathfrak{U}$ contains $S$; i.e., the sequence is uniformly convergent on
the whole set $S$.

Convergence on the part of the sequence $f_n(x)$ implies that $S - S(k, \nu)$
contains $S - S(k, \nu+1)$, and $\prod_{\nu=1}^{\infty} \{S - S(k, \nu)\} = 0$¹³ for every $k$. Hence
for any particular $k$ it is true, since these sets are all closed, that there exists
some $\nu_k$ such that $S - S(k, \nu_k) = 0$.
Pick $\nu_1, \nu_2, \nu_3, \cdots$ as the sequence $(\nu)$ and consider the set

$U(\nu) = \prod_{k=1}^{\infty} S(k, \nu_k)$.

Since $S - U(\nu) = \sum_{k=1}^{\infty} \{S - S(k, \nu_k)\} = 0$ it is clear that the sequence
is uniformly convergent on the whole set $S$.


2.2. One might make the intuitive remark that if all the sets $S - S(k, \nu)$
are nearly closed then the sequence is uniformly convergent on very nearly
the whole set $S$.

In this connection let us make the following definition, keeping as close
as possible to the familiar notions of classical analysis.

A set $S$ is approximately closed¹⁴ if for every $\epsilon > 0$ there exists a set $s$
such that $m_e(s) < \epsilon$ and $S - s$ is closed.

Before proceeding to discuss the above intuitive remark we shall state a
few properties of approximately closed sets. All sets mentioned are assumed
to be bounded.


2.3. The sum of two approximately closed sets is approximately closed.

2.4. A bounded open set is approximately closed.

2.5. The complement of an approximately closed set in an interval is
approximately closed.


10 D. Th. Egoroff, footnote 1, p. 549, loc. cit., p. 244.
11 F. Riesz, footnote 2 (iii), p. 549, loc. cit., pp. 244-246.
12 This same trivial remark forms the nucleus of the latest proof by Riesz, footnote 2
(iii), p. 549, loc. cit., p. 244.
13 $E = 0$ means that $E$ is the null set.
14 An approximately closed set is clearly measurable, and conversely.




2.6. The difference of two approximately closed sets is approximately
closed.


2.7. The product of any number of approximately closed sets is
approximately closed.


2.8. If the elements in a sequence of approximately closed sets have the
character that each contains the next and their product is empty, then their
exterior measures approach the limit zero.


These properties are capable of immediate proof.¹⁵ By way of illustration
we shall demonstrate the last.


Suppose that the sets are $S_1, S_2, S_3, \cdots$ and $\prod_{n=1}^{\infty} S_n = 0$. Given any
$\epsilon > 0$ pick a series of positive terms $\epsilon_1 + \epsilon_2 + \epsilon_3 + \cdots < \epsilon$. Now $S_n = C_n + s_n$,
where $C_n$ is closed and $m_e(s_n) < \epsilon_n$, for every value of $n$.
Define $C^*_n = C_1 \cdot C_2 \cdot C_3 \cdots C_n$. Then $\prod C^*_n = \prod C_n = 0$, and $S_n$
contains $C^*_n$.

Since $C^*_n$ contains $C^*_{n+1}$, and all of these sets are closed, after a certain
$N$ all the sets $C^*_n$ are empty. Hence for $n > N$ we have $m_e(C^*_n) = 0$.

If $x$ is not in $C^*_n$ but is in $S_n = S_1 \cdot S_2 \cdot S_3 \cdots S_n$, then $x$ is not in some
$C_m$, $(1 \le m \le n)$ but is in all the $S_i$, $(i = 1, 2, 3, \cdots, n)$. Hence $x$ is in $s_m$.

Therefore $C^*_n + s_1 + s_2 + s_3 + \cdots + s_n$ contains $S_n$, which contains
$C^*_n$, and so $S_n = C^*_n + s^*_n$ where $m_e(s^*_n) < \epsilon_1 + \epsilon_2 + \epsilon_3 + \cdots + \epsilon_n$.

For $n > N$: $m_e(S_n) \le m_e(C^*_n) + m_e(s^*_n) < 0 + \epsilon_1 + \epsilon_2 + \epsilon_3 + \cdots + \epsilon_n < \epsilon$.

Hence $m_e(S_n) \to 0$.


2.9. We may now turn to the intuitive remark of 2.2 and show that
if all the sets $S - S(k, \nu)$ are approximately closed then the sequence is
approximately uniformly convergent.¹⁶


Proof. Given any $\epsilon > 0$ take a convergent series of positive terms
$\epsilon_1 + \epsilon_2 + \epsilon_3 + \cdots < \epsilon$.

Since $S - S(k, \nu)$ contains $S - S(k, \nu+1)$ and $\prod_{\nu=1}^{\infty} \{S - S(k, \nu)\} = 0$
it follows from 2.8 that $m_e\{S - S(k, \nu)\} \to 0$ as $\nu \to \infty$ for every $k$.


15 The proofs are materially aided by two criteria: (i) If for every $\epsilon > 0$ there
is an $s$ of exterior measure $< \epsilon$ making $S + s$ approximately closed, then $S$ is
approximately closed; (ii) If for every $\epsilon$, $S = S_{\epsilon} + s_{\epsilon}$ where $S_{\epsilon}$ is approximately closed and
$m_e(s_{\epsilon}) < \epsilon$, then $S$ is approximately closed.
16 F. Riesz, footnote 2 (iii), p. 549, loc. cit., p. 244.




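Thus for every $k$ there exists a $\nu_k$ such that

$m_e\{S - S(k, \nu_k)\} < \epsilon_k$.

Define $(\nu): \nu_1, \nu_2, \nu_3, \cdots$.

Now $m_e\{S - U(\nu)\} = m_e(\sum_{k=1}^{\infty} \{S - S(k, \nu_k)\}) \le \sum_{k=1}^{\infty} m_e\{S - S(k, \nu_k)\} < \epsilon_1 + \epsilon_2 + \epsilon_3 + \cdots < \epsilon$.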


2.10. The problem of insuring approximate uniform convergence on the
part of the sequence $f_n(x)$ thus resolves itself into the problem of insuring
the sets $S - S(k, \nu)$ to be approximately closed. If we assume that the set $S$
is approximately closed then by 2.6 the problem is further reduced to insuring
that the sets $S(k, \nu)$ are approximately closed. This clearly follows if the
functions are all continuous, but the condition is much too restrictive. As in
the case of sets, we remain as close as possible to the familiar notions of
classical analysis and define a class of functions which, while much more
general, are close enough to continuous functions to carry us through.

A function $f(x)$ is said to be approximately continuous¹⁷ on a set $S$
if for every $\epsilon > 0$ there exists a set $s$ such that $m_e(s) < \epsilon$ and $f(x)$ is
continuous on $S - s$ with respect to $S - s$.

We list some properties of approximately continuous functions which lend 
themselves to immediate proof. 


2.11. The sum, difference, product and quotient (if the denominator is
different from zero except on a subset of measure 0) of two functions which
are approximately continuous on a set $S$ are again approximately continuous
on $S$.


2.12. The absolute value of a function which is approximately
continuous on $S$ is approximately continuous on $S$.


2.13. If the function $f(x)$ is approximately continuous on an
approximately closed set $S$ and if $c$ is any constant, then the subset of $S$ on which
the inequality $f(x) < c$ holds, is approximately closed.


2.14. We may now show that if $S$ is approximately closed and the
functions $f_n(x)$ are approximately continuous, then the sets $S - S(k, \nu)$ are
approximately closed.


17 The identity of functions approximately continuous on an interval and functions
measurable on the same interval may be shown by using the theorem of Egoroff, or
results of Borel and Hahn, footnote 2 (iii), p. 549, loc. cit., pp. 246-247.




Since $S$ is approximately closed it follows from 2.6 that it is sufficient
to show that the sets $S(k, \nu)$ are approximately closed.
If $S_{k,n,m}$ is the set of points of $S$ where $|f_n(x) - f_m(x)| \le 1/k$, then
$S(k, \nu) = \prod_{n,m=\nu}^{\infty} S_{k,n,m}$. Hence by 2.7 it remains but to show that the sets
$S_{k,n,m}$ are approximately closed. By 2.11 and 2.12 $|f_n(x) - f_m(x)|$ is
approximately continuous and so by 2.13 $S_{k,n,m}$ is approximately closed.


2.15. Combining 2.9 and 2.14 we now have the following theorem: 


If on an approximately closed set $S$ we have a convergent sequence of
approximately continuous functions $f_n(x)$, then the sequence is approximately
uniformly convergent.


The theorem is of course valid if the convergence of the hypothesis is
convergence almost everywhere. This is the celebrated theorem of Egoroff.
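
A familiar special case may help fix ideas. The sketch below is an illustration under assumptions not taken from the paper: it uses $f_n(x) = x^n$ on $S = [0, 1)$, which converges to 0 but not uniformly, and removes the interval $(1 - \epsilon, 1)$ of measure $\epsilon$, leaving $U = [0, 1 - \epsilon]$. The function name is hypothetical.

```python
def first_uniform_index(eps, tol):
    """Smallest n for which the sup of x**n over [0, 1 - eps] is at most tol.

    On [0, 1 - eps] the supremum of x**n is attained at the right endpoint
    and decreases with n, so the bound holds for every larger n as well.
    """
    n = 1
    while (1.0 - eps) ** n > tol:
        n += 1
    return n


n0 = first_uniform_index(0.1, 0.01)     # discard a set of measure 0.1
n1 = first_uniform_index(0.01, 0.01)    # discard a set of measure 0.01
```

The smaller the discarded set, the later the uniform bound sets in, but for each fixed $\epsilon$ such an index exists, which is exactly approximate uniform convergence in this instance.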


2.16. An immediate application is the following: If $f_n(x)$ is a sequence
of functions defined and approximately continuous on an approximately closed
set $S$, and if the sequence converges almost everywhere to a function $f(x)$,
then $f(x)$ is approximately continuous.


The proof is immediate by reducing the considerations to a slightly 
smaller closed set on which all the functions are continuous and the con- 
vergence is uniform. 

Hence, while in classical analysis it is not true that a sequence of con- 
tinuous functions defined and convergent on a closed set is uniformly con- 
vergent and the limit function is continuous, the above statement becomes 
true if the word approximately is inserted before the words continuous, closed
and uniformly convergent.


2.17. It is to be noticed that as a consequence of 2.5 a function which
is approximately continuous on an approximately closed set may be made
approximately continuous on an interval.


Suppose that $f(x)$ is approximately continuous on an approximately
closed set $S$ lying in the interval $(a, b)$. Then the function equal to $f(x)$ on $S$
and to 0 elsewhere in $(a, b)$ is approximately continuous on $(a, b)$.




It follows that we need consider only functions which are approximately
continuous on an interval. We now have the well known theorem:


A necessary and sufficient condition that a function $f(x)$ be approximately
continuous on an interval $(a, b)$, is that there exist a sequence of continuous
functions defined on $(a, b)$ and converging almost everywhere in this interval
to $f(x)$.

The sufficiency follows from 2.16.

To prove the necessity take any convergent series of positive terms
$\eta_1 + \eta_2 + \eta_3 + \cdots$. For every $n$ it is true that there is an open set $o_n$ such
that $f(x)$ is continuous on $(a, b) - o_n = C_n$ and $m_e(o_n) < \eta_n$. Define a
function $c_n(x)$ to be equal to $f(x)$ on $C_n$, and linear in the intervals of $o_n$,
taking on continuously the values $f(x)$ at the ends of these intervals.

Now $c_n(x)$ is continuous. Moreover, if $\xi$ is in only a finite number of
the sets $o_n$ then $c_n(\xi) \to f(\xi)$, since then there is an $n(\xi)$ such that for all
$n > n(\xi)$, $\xi$ will be in $C_n$, and so $c_n(\xi) = f(\xi)$.
The set of points not in only a finite number of the sets $o_n$ is

$Z = (\sum_{\nu=1}^{\infty} o_{\nu}) \cdot (\sum_{\nu=2}^{\infty} o_{\nu}) \cdot (\sum_{\nu=3}^{\infty} o_{\nu}) \cdots$.

But $\sum_{\nu=n}^{\infty} o_{\nu}$ contains $\sum_{\nu=n+1}^{\infty} o_{\nu}$ and $m_e(\sum_{\nu=n}^{\infty} o_{\nu}) < \eta_n + \eta_{n+1} + \cdots$. Since
the series is convergent $m_e(Z) = 0$ and $c_n(x) \to f(x)$ for almost every $x$.

This characterization of approximately continuous functions is of
immediate importance, since now the theory of Lebesgue integration may be
developed along lines similar to the work of Riesz,¹⁸ beginning with the well
known concept of the integral of a continuous function. The salient fact is
that the Lebesgue theory of integration may be successfully presented with
notions very similar to the familiar concepts of classical analysis (in fact
within epsilon of these).


2.18. Let us come back from the applications of the theorem of Egoroff 
to the first problem of this chapter. The problem was to find conditions 
powerful enough to insure approximate uniform convergence. One set of 
conditions has been given by Egoroff, and restricts not only the type of 


18 F. Riesz, footnote 2 (i), p. 549, loc. cit.




functions which make up the sequence $f_n(x)$, but also the set $S$ on which they
are defined. We find that it is unnecessary to restrict the set $S$. The proof
of this fact is based on two auxiliary theorems, the first of which is a lemma
of Sierpinski and Zygmund.¹⁹


AUXILIARY THEOREM I. If a function $f(x)$ is continuous on a set $S$
then there is an approximately closed set $M$ containing $S$ and a function $f^*(x)$
continuous on $M$ such that $f^*(x) = f(x)$ on $S$.


Proof. Consider the closure of $S$ which is $S + S' = S^*$. Let $M$ be the
set of points of $S^*$ where the saltus²⁰ of $f(x)$ is zero. It is clear that $M$
contains $S$. Moreover, $S^*$ is closed and so the subset $S^*_k$ of its points where
the saltus of $f(x)$ is $\ge 1/k$ is closed. But $M = \prod_{k=1}^{\infty} (S^* - S^*_k)$ and so it is
approximately closed by 2.6 and 2.7. The limit of $f(x)$ at every point of $M$
is unique. Define $f^*(x)$ to be equal to $f(x)$ on $S$ and the limit of $f(x)$ at every
point of $M - S$. $M$ is the set and $f^*(x)$ the function required by the theorem.


AUXILIARY THEOREM II. If a sequence of continuous functions $f_n(x)$
is defined on an approximately closed set $M$ then the subset of $M$ where the
sequence converges is approximately closed.


Proof. Define $\bar{f}_n(x) = \lim_{m \to \infty} \overline{f_n(x), f_{n+1}(x), \cdots, f_m(x)}$ and
$\underline{f}_n(x) = \lim_{m \to \infty} \underline{f_n(x), f_{n+1}(x), \cdots, f_m(x)}$.²¹


These functions are approximately continuous by 2.16 for every value
of $n$, and so $\limsup f_n(x) = \lim \bar{f}_n(x)$ and $\liminf f_n(x) = \lim \underline{f}_n(x)$ are also
approximately continuous. Hence $(\limsup f_n(x) - \liminf f_n(x))$ is an
approximately continuous function by 2.11, and it follows from 2.13 that the
set of points where the above difference is zero is an approximately closed set.
In other words the set of points of $M$ where the limit exists is approximately
closed.

We are now in a position to prove that if a sequence of approximately
continuous functions $f_n(x)$ is defined and convergent on any set $S$, then the
sequence is approximately uniformly convergent.


19 “Sur une fonction discontinue,” Fundamenta Mathematicae, vol. 4 (1923), p. 317.
20 The saltus of $f(x)$ at $\xi$ is the least upper bound of the difference [$\limsup f(\xi_n)$
$- \liminf f(x_n)$] for all possible sequences (chosen from points of $S$) $\xi_n$ and $x_n$
converging to $\xi$.
21 See footnote 6, p. 551.




Proof. Given $\epsilon > 0$ take a series of positive terms $\epsilon_1 + \epsilon_2 + \epsilon_3 + \cdots = \epsilon/2$.
For every $n$ there is an $s_n$ of exterior measure $< \epsilon_n$ such that $f_n(x)$ is
continuous on $S - s_n$. Hence the functions $f_n(x)$, $(n = 1, 2, 3, \cdots)$ are all
continuous on the set $S - \sum s_n$ and $m_e(\sum s_n) < \epsilon/2$.


For every value of $n$, by Auxiliary Theorem I, there is an approximately
closed set $M_n$ containing $S - \sum s_n$ and a function $f^*_n(x)$ continuous on $M_n$ such
that $f^*_n(x) = f_n(x)$ on $S - \sum s_n$. Hence the functions $f^*_n(x)$, $(n = 1, 2, 3, \cdots)$
are all continuous on the approximately closed set $M = \prod M_n$ and $M$ contains
$S - \sum s_n$.

By Auxiliary Theorem II the subset $M^*$ of $M$ on which the sequence
$f^*_n(x)$ is convergent is approximately closed, and obviously contains $S - \sum s_n$.

By the theorem of Egoroff, given $\epsilon/2$ there is a set $s$ of exterior measure
$< \epsilon/2$ such that the convergence is uniform on $M^* - s$. Hence the
convergence of the sequence $f^*_n(x)$ is uniform, a fortiori, on $S - (\sum s_n + s)$.
Since $f^*_n(x) = f_n(x)$, $(n = 1, 2, 3, \cdots)$ on $S - \sum s_n$ and $m_e(\sum s_n + s) < \epsilon$
the sequence $f_n(x)$ is approximately uniformly convergent on the set $S$.

It appears, then, that when we consider the elements of the class $\mathfrak{U}$ from
the point of view of measure we come to the conclusion that as long as the
functions of the sequence $f_n(x)$ are approximately continuous on $S$ there is
always a set $U$ as close in measure to $S$ as we please.


THE OHIO STATE UNIVERSITY,
COLUMBUS, OHIO.




ON THE MOMENTUM PROBLEM FOR DISTRIBUTION FUNCTIONS
IN MORE THAN ONE DIMENSION.


By E. K. HAVILAND.


The Hausdorff momentum problem¹ has recently been solved in the
multi-dimensional case by Hildebrandt and Schoenberg.² In the present paper
there will be treated the corresponding extension of the more general
one-dimensional Hamburger¹ momentum problem; i.e., finding a necessary and
sufficient condition for the existence of a distribution function³ $\phi$ such that

$c_{nm} = \iint_S x^n y^m \, d_{xy}\phi(E)$, $(n, m = 0, 1, 2, \cdots)$,

where $S$ denotes the entire $(x, y)$-plane and $\| c_{nm} \|$, $(n, m = 0, 1, 2, \cdots)$,
is a given real infinite matrix in which $c_{00} = 1$.

The majority of methods, in particular those based on Jacobi matrices
and continued fractions, seem inapplicable in more than one dimension. However,
the method of M. Riesz,⁴ developed from the ideas of F. Riesz in
connection with linear functionals, can be extended to the multi-dimensional case,
and the purpose of the present paper is to carry out that extension, the proofs
being given, for convenience, in the case of two dimensions.


1 For references to literature on the momentum problem in one dimension, cf.
M. H. Stone, op. cit., pp. 613-614. References are collected at the end of the present
paper.

2 T. H. Hildebrandt and I. J. Schoenberg, loc. cit.

3 The monotone absolutely additive set function $\phi(E)$ is said to be a distribution
function if $0 \le \phi(E) \le 1$ and $\phi(S) = 1$, where $S$ denotes the whole $(x, y)$-plane.
Cf. E. K. Haviland, loc. cit., p. 627.

4 M. Riesz, loc. cit., pp. 4-8.


We consider the operation which makes correspond to any polynomial,

$P(x, y) = \sum_{n=0}^{N} \sum_{m=0}^{M} a_{nm} x^n y^m$,

the number

$P_c = \sum_{n=0}^{N} \sum_{m=0}^{M} a_{nm} c_{nm}$,

where $\| c_{nm} \|$ is a given real infinite matrix. The operation is seen to be
distributive. It is said to be non-negative if $P_c \ge 0$ provided $P(x, y)$ is
non-negative, i.e., provided $P(x, y) \ge 0$ for all $(x, y)$, and in this case the matrix
$\| c_{nm} \|$ will be said to be non-negative. The result of this investigation is
then given by the


THEOREM. For the existence of a distribution function $\phi(E)$ such that

(1) $\iint_S x^n y^m \, d_{xy}\phi(E) = c_{nm}$, $(n, m = 0, 1, 2, \cdots;\ c_{00} = 1)$,

it is necessary and sufficient that the matrix $\| c_{nm} \|$ be non-negative.


Proof. The necessary condition is immediately clear. For if a polynomial
$P(x, y) \ge 0$ for every real $(x, y)$ and if $\phi(E)$ is a distribution function,
$\iint_S P(x, y) \, d_{xy}\phi(E) \ge 0$ and hence, if (1) is to hold, we must
have $P_c \ge 0$.
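
The necessity argument can be checked on a small example. The sketch below is an illustration only: it takes for $\phi$ a hypothetical discrete distribution with three point masses (an assumption, not an example from the paper), computes the moments $c_{nm}$ exactly, and evaluates the operation $P_c$ on the non-negative polynomial $(x - y)^2$.

```python
from fractions import Fraction
from itertools import product


def moments(points, weights, order):
    """c[(n, m)] = sum of w * x**n * y**m over the point masses."""
    return {
        (n, m): sum(w * x**n * y**m for (x, y), w in zip(points, weights))
        for n, m in product(range(order + 1), repeat=2)
    }


def functional(coeffs, c):
    """P_c = sum of a_nm * c_nm, with coeffs mapping (n, m) to a_nm."""
    return sum(a * c[nm] for nm, a in coeffs.items())


# Hypothetical distribution: masses 1/2, 1/4, 1/4 at three points.
points = [(0, 0), (1, 2), (-1, 1)]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
c = moments(points, weights, 2)

# P(x, y) = (x - y)**2 = x**2 - 2*x*y + y**2 is non-negative everywhere.
square = {(2, 0): 1, (1, 1): -2, (0, 2): 1}
value = functional(square, c)
```

Here $P_c$ coincides with the integral of $P$ against the distribution, so it cannot be negative for a non-negative polynomial.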

To prove the sufficient condition, we let $P_{ij} : (\xi_i, \eta_j)$, $(i, j = 1, 2, \cdots)$
be a denumerable set of points dense in $S$. In particular, we suppose them to
be the intersections of sets of lines parallel to the coördinate axes and
everywhere dense in the plane and let the functions $g_{ij}(x, y)$ be defined by
$g_{ij}(x, y) = 1$ if $x < \xi_i$ and $y < \eta_j$; while $g_{ij}(x, y) = 0$ otherwise. The
operation which makes $P_c$ correspond to the polynomial $P(x, y)$ can be extended
to the modul generated by finite linear combinations of $1, x, y, x^2, xy, y^2, \cdots$,
$g_{11}(x, y), g_{12}(x, y), g_{21}(x, y), \cdots$ with real constants as coefficients in such
a way that the operation remains distributive and non-negative, in the sense
that to every non-negative function of this modul there corresponds a
non-negative functional value.

This extension is made step by step. We consider first the modul $A_1$
generated by the various powers $x^m y^n$, $(m, n = 0, 1, 2, \cdots)$, and by $g_{11}(x, y)$.
To $g_{11}(x, y)$ we attach, as the value of the functional, a number $\gamma_{11}$, attaching
at the same time to every finite linear combination of $1, x, y, \cdots$, and $g_{11}(x, y)$
the corresponding combination of $c_{00}, c_{10}, c_{01}, \cdots$ and $\gamma_{11}$. There is
thus defined a distributive operation upon the modul $A_1$.

In order that the operation be non-negative, y,, will have to be so chosen 
that y11 S S where y,, is the upper limit of the values which the opera- 
tion makes correspond to all polynomials not greater than g:(z%,y) for any 
(t,y) and ¥,;, is the lower limit of the values which the operation makes 
correspond to all polynomials not less than g,,(z,y) for any (2,y). That 
such polynomials actually exist may be seen from the fact that f(z, y) =0 
belongs to the former class and f(x, y) =1 to the latter. The operation being 
distributive and non-negative in the modul of polynomials, we shall have ® 


The cases and 7,, < 7,, are associated respectively with the determi- 


d 
); 
es 
n- 
e, 
fs 
be | 
ix 
ef. 
ent 


564 E. K. HAVILAND. 


Yu If < we choose, for definiteness, y11 = Y11. In this way, 
the operation on the field A, is made non-negative as well as distributive, 
since then P(z,y) +4 91(2,y) = 0 implies Pe + = 90. 

We next form the modul A, by adjoining to the generators of A, the 
function g,2(z,y). To gi2(z,y) we assign as its value the number 2, the 
lower limit of the values corresponding to those functions of the modul A, 
which are not less than g:2(z, y) for any (z,y). We thus obtain a distributive 
and non-negative operation defined over the modul Az. Continuing in this 
manner, we extend the operation to the moduls A;, A,,- - - obtained by the 
successive adjunction of the functions g2:(2, y), 9:3(2, y),° to the modul A,, 

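We then define a function $F(x, y)$ at the points $P_{ij}$: $(\xi_i, \eta_j)$ by the equation $F(\xi_i, \eta_j) = \gamma_{ij}$, where $\gamma_{ij}$ denotes the value of the functional corresponding to the function $g_{ij}(x, y)$. This function $F(x, y)$ possesses, on the points $P_{ij}$, the monotone property in the sense of Radon.⁶ For suppose $\xi_{i_1} < \xi_{i_2}$ and $\eta_{j_1} < \eta_{j_2}$. Then from the definition of the functions $g_{ij}(x, y)$ it follows that for all $(x, y)$

$0 \le g_{i_2 j_2}(x, y) - g_{i_2 j_1}(x, y) - g_{i_1 j_2}(x, y) + g_{i_1 j_1}(x, y).$

Accordingly, as all these functions are included in some one of the moduls $A_k$ (and, of course, in all succeeding moduls), it follows that

$0 \le \gamma_{i_2 j_2} - \gamma_{i_2 j_1} - \gamma_{i_1 j_2} + \gamma_{i_1 j_1},$

i. e.,

(2) $0 \le F(\xi_{i_2}, \eta_{j_2}) - F(\xi_{i_2}, \eta_{j_1}) - F(\xi_{i_1}, \eta_{j_2}) + F(\xi_{i_1}, \eta_{j_1}).$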


Thus $F(x, y)$ possesses on the points $P_{ij}$ the monotone property, q. e. d.
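The monotone property just established can be checked numerically on a finite grid. The following is a minimal sketch and not part of the paper: the names `rect_increment`, `is_monotone_on_grid`, `F_product` and `F_sum` are illustrative assumptions. It evaluates the second difference appearing in (2) on every cell of a grid; a product of two one-dimensional distribution functions passes, whereas a function that merely increases in each variable separately need not.

```python
import math


def rect_increment(F, x1, x2, y1, y2):
    """Second difference of F over the rectangle x1 <= x < x2, y1 <= y < y2."""
    return F(x2, y2) - F(x2, y1) - F(x1, y2) + F(x1, y1)


def logistic(u):
    """A continuous one-dimensional distribution function."""
    return 1.0 / (1.0 + math.exp(-u))


def F_product(x, y):
    """Product of two one-dimensional distribution functions."""
    return logistic(x) * logistic(y)


def F_sum(x, y):
    """Increasing in x and in y separately, but not monotone in the sense of (2)."""
    return logistic(x + y)


def is_monotone_on_grid(F, xs, ys, tol=1e-12):
    """True if every grid cell has a non-negative second difference (up to tol)."""
    return all(
        rect_increment(F, xs[i], xs[i + 1], ys[j], ys[j + 1]) >= -tol
        for i in range(len(xs) - 1)
        for j in range(len(ys) - 1)
    )


grid = [k / 2.0 for k in range(-8, 9)]  # -4.0, -3.5, ..., 4.0
```

On this grid `is_monotone_on_grid(F_product, grid, grid)` holds, while `F_sum` fails on cells where the logistic curve is concave.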
We shall next show that $F(-\infty, y) = F(x, -\infty) = 0$, where $F(-\infty, y) = \lim_{x \to -\infty} F(x, y)$, the points $(x, y)$ belonging to the sequence $\{P_{ij}\}$ and the approach to the limit being uniform with respect to $y$, and a similar interpretation is to be placed on $F(x, -\infty)$. If $\xi_i < 0$, we have $g_{ij}(x, y) \le \xi_i^{-2} x^2$ for all $j$ and all $(x, y)$. It follows that

$0 \le F(\xi_i, \eta_j) = \gamma_{ij} \le c_{20}\, \xi_i^{-2},$

wherefore, as $\xi_i \to -\infty$, $\lim F(\xi_i, \eta_j) = 0$ uniformly for all $\eta_j$, q. e. d. Similarly, if $\eta_j < 0$, we have $g_{ij}(x, y) \le \eta_j^{-2} y^2$, so it may be shown in a


nateness or the non-determinateness of the momentum problem. For the one-dimensional 
case, cf. M. Riesz, loc. cit., p. 9. 
⁶ Cf. J. Radon, loc. cit. I, p. 1304.




MOMENTUM PROBLEM FOR DISTRIBUTION FUNCTIONS. 565 


similar manner that as $\eta_j \to -\infty$, $\lim F(\xi_i, \eta_j) = 0$ uniformly for all $\xi_i$. Again, $g_{ij}(x, y) \le 1$, wherefore

$0 \le F(\xi_i, \eta_j) = \gamma_{ij} \le c_{00} = 1.$


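With $F(x, y)$ there may be associated an interval function $\psi(I)$ defined for an everywhere dense⁷ set of intervals

$I_{i_1 i_2 j_1 j_2}$: $(\xi_{i_1} \le x < \xi_{i_2};\ \eta_{j_1} \le y < \eta_{j_2})$

by

$\psi(I_{i_1 i_2 j_1 j_2}) = F(\xi_{i_2}, \eta_{j_2}) - F(\xi_{i_2}, \eta_{j_1}) - F(\xi_{i_1}, \eta_{j_2}) + F(\xi_{i_1}, \eta_{j_1}),$

and the definition continues to hold when $\xi_{i_1} = \eta_{j_1} = -\infty$, in which case $\psi(I_{i_2 j_2}) = F(\xi_{i_2}, \eta_{j_2})$, where $I_{i_2 j_2}$: $(-\infty < x < \xi_{i_2};\ -\infty < y < \eta_{j_2})$. From its definition $\psi(I)$ is seen to be additive and we have already seen from equation (2) that it possesses the monotone property. In consequence, as $h \to 0$, $k \to 0$, $\lim F(\xi - h, \eta - k)$ exists, provided $h \ge 0$, $k \ge 0$ and the points $(\xi - h, \eta - k)$, with perhaps the exception of $(\xi, \eta)$, belong to the everywhere dense set of points for which $F$ is defined.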

We now extend the definition of $F$ to all points $(x, y)$ not belonging to the given everywhere dense set by setting $F(x, y) = \lim_{h \to 0,\, k \to 0} F(x - h, y - k)$, where $(x - h, y - k)$ is a point of the everywhere dense set and $h \ge 0$, $k \ge 0$, and we define $\psi(I)$ for any interval $I$: $(x_1 \le x < x_2;\ y_1 \le y < y_2)$ by

(3) $\psi(I) = F(x_2, y_2) - F(x_2, y_1) - F(x_1, y_2) + F(x_1, y_1).$


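It is seen from the definition that $\psi(I)$ is additive. Moreover, it is monotone, for, given any $\epsilon > 0$ and any $x_1 < x_2$, $y_1 < y_2$, there exist points $(\xi_{i_m}, \eta_{j_n})$ $(m, n = 1, 2)$ of the everywhere dense set such that

$0 \le F(x_m, y_n) - F(\xi_{i_m}, \eta_{j_n}) < \epsilon$ $(m, n = 1, 2),$

and this, together with (2), implies

$0 \le F(x_2, y_2) - F(x_2, y_1) - F(x_1, y_2) + F(x_1, y_1).$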


Similarly, it may be shown that $F(x, y)$, as thus defined for all points of the plane, is such that $F(-\infty, y) = F(x, -\infty) = 0$, while $0 \le F(x, y) \le 1$ for all $(x, y)$. In fact, it will appear later that $F(+\infty, +\infty) = 1$. Hence $\psi(I)$ is a bounded additive monotone non-decreasing interval function. It follows⁸


⁷ Cf. E. K. Haviland, loc. cit., p. 628, Definition 4.
⁸ The proof is similar to that given by J. Radon, loc. cit. II, p. 1093.




that its discontinuities, if any, fall upon a denumerable set of lines parallel to the coördinate axes. In consequence, there is an everywhere dense set of points $(\xi'_i, \eta'_j)$, which may be taken to be the intersections of two everywhere dense sets of lines parallel to the coördinate axes, such that

$\lim_{h \to 0,\, k \to 0} F(\xi'_i - h, \eta'_j - k) = F(\xi'_i, \eta'_j), \quad h \ge 0,\ k \ge 0,$

where the points $(\xi'_i - h, \eta'_j - k)$ belong to the same everywhere dense set as does $(\xi'_i, \eta'_j)$. Then there exists⁹ a bounded monotone absolutely additive set function $\phi(E)$ whose corresponding point function, $G(x, y)$, coincides with $F(x, y)$ on the everywhere dense set of points $(\xi'_i, \eta'_j)$. Moreover, $G(x, y)$ and $F(x, y)$ have the same discontinuity points and are equal at all other points, i. e., $\psi(I)$ and $\phi(E)$ are equal on all their non-singular rectangles. We shall show that $\phi(E)$ is a solution of the momentum problem belonging to the preassigned matrix $\| c_{nm} \|$.

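To this end, we consider a monomial¹⁰ $x^n y^m$, where $n, m$ are arbitrary non-negative integers. Let $2r$ be a fixed even number greater than $n + m$, and choose $-T_1 < 0$ and $T_2 > 0$ so that they belong to the set $\xi_1, \xi_2, \cdots$, and $-T'_1 < 0$ and $T'_2 > 0$ so that they belong to the set $\eta_1, \eta_2, \cdots$. Furthermore, $T_i$, $T'_i$ shall be so large that

$| x^n y^m | \le \epsilon (x^{2r} + y^{2r})$

outside the rectangle $R$: $(-T_1 \le x < T_2;\ -T'_1 \le y < T'_2)$ and on its boundary, $\epsilon$ being a fixed arbitrarily small positive quantity. We divide $R$ by lines $x = x_k = \xi_{i_k}$ $(k = 1, \cdots, p + 1;\ x_1 = -T_1,\ x_{p+1} = T_2)$ and $y = y_l = \eta_{j_l}$ $(l = 1, \cdots, q + 1;\ y_1 = -T'_1,\ y_{q+1} = T'_2)$ into a set of rectangles

$R_{kl}$: $(x_k \le x < x_{k+1};\ y_l \le y < y_{l+1})$

in each of which the oscillation of $x^n y^m$ is less than $\epsilon'$, where $\epsilon'$ is another fixed arbitrarily small positive quantity. Let $(X_k, Y_l)$ be a point in the interior of the rectangle $R_{kl}$. We then form the step function $v(x, y)$ which vanishes outside $R$ and which takes the value $X_k^n Y_l^m$ in $R_{kl}$. Then for every $(x, y)$

$v(x, y) - \epsilon' - \epsilon (x^{2r} + y^{2r}) \le x^n y^m \le v(x, y) + \epsilon' + \epsilon (x^{2r} + y^{2r}).$

Since the function $v(x, y)$ belongs to one of the moduls $A_1, A_2, \cdots$ (and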


⁹ Cf. E. K. Haviland, loc. cit., p. 651, and the references there given.
¹⁰ The proof holds also for an arbitrary polynomial.




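hence to any subsequent modul), the functional operation is defined for it, and if to $v(x, y)$ corresponds the functional value $v_c$,

(4) $v_c - \epsilon' - \epsilon (c_{2r,0} + c_{0,2r}) \le c_{nm} \le v_c + \epsilon' + \epsilon (c_{2r,0} + c_{0,2r}).$

Furthermore,

$v(x, y) = \sum_{k=1}^{p} \sum_{l=1}^{q} X_k^n Y_l^m \, [\, g_{i_{k+1} j_{l+1}}(x, y) - g_{i_{k+1} j_l}(x, y) - g_{i_k j_{l+1}}(x, y) + g_{i_k j_l}(x, y) \,].$

Hence, as $\gamma_{i_k j_l} = F(\xi_{i_k}, \eta_{j_l}) = F(x_k, y_l)$, we have by (3)

$v_c = \sum_{k=1}^{p} \sum_{l=1}^{q} X_k^n Y_l^m \, \psi(R_{kl}).$

The inequality (4) can then be written

(5) $\sum_{k=1}^{p} \sum_{l=1}^{q} X_k^n Y_l^m \, \psi(R_{kl}) - \epsilon' - \epsilon (c_{2r,0} + c_{0,2r}) \le c_{nm} \le \sum_{k=1}^{p} \sum_{l=1}^{q} X_k^n Y_l^m \, \psi(R_{kl}) + \epsilon' + \epsilon (c_{2r,0} + c_{0,2r}).$

Now $x^n y^m$ is continuous in $x$ and $y$ together in every rectangle and $\psi(I)$ is a bounded monotone additive interval function. Hence,¹¹ as the diameter of the $R_{kl}$ approaches zero,

$\sum_{k=1}^{p} \sum_{l=1}^{q} X_k^n Y_l^m \, \psi(R_{kl}) \to \iint_R x^n y^m \, d_{xy}\psi(I).$

At the same time, $\epsilon' \to 0$, so that from (5) we obtain

$\iint_R x^n y^m \, d_{xy}\psi(I) - \epsilon (c_{2r,0} + c_{0,2r}) \le c_{nm} \le \iint_R x^n y^m \, d_{xy}\psi(I) + \epsilon (c_{2r,0} + c_{0,2r}).$

Let $T_1, T'_1, T_2, T'_2 \to +\infty$ and $\epsilon \to 0$. Then

$\iint_S x^n y^m \, d_{xy}\psi(I) = c_{nm}.$

As, however, for any arbitrarily large non-singular rectangle $R$ of $\phi$ and $\psi$,

$\iint_R x^n y^m \, d_{xy}\phi(E) = \iint_R x^n y^m \, d_{xy}\psi(I),$

it follows that¹²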


¹¹ Cf., e. g., S. Bochner, loc. cit., p. 391.
¹² This follows directly if $m$ and $n$ are both even. Otherwise one has first to use




$\iint_S x^n y^m \, d_{xy}\phi(E) = c_{nm},$ q. e. d.


The distribution function $\phi$ whose existence is thus established is not necessarily uniquely determined. Sufficient conditions that $\phi$ be uniquely determined by its momenta $c_{nm}$ have been found by V. Romanovsky¹³ and by the present author.¹⁴ In particular, $\phi$ is uniquely determined if the $c_{nm}$


are such that 
2n 
(28) | | — o(n), 


and the present author has shown¹⁴ that this condition is almost necessary in that $\phi$ may not be uniquely determined if o(n) is replaced by o(n***).


REFERENCES. 


S. Bochner, “Monotone Funktionen, Stieltjessche Integrale und harmonische Analyse,” Mathematische Annalen, vol. 108 (1933), pp. 378-410.

E. K. Haviland, “On the theory of absolutely additive distribution functions,” American Journal of Mathematics, vol. 56 (1934), pp. 625-658.

T. H. Hildebrandt and I. J. Schoenberg, “On linear functional operators and the moment problem for a finite interval in one or several dimensions,” Annals of Mathematics, ser. 2, vol. 34 (1933), pp. 317-328.

J. Radon, I: “Theorie und Anwendungen der absolut additiven Mengenfunktionen,” Sitzungsberichte der mathematischen-naturwissenschaftlichen Klasse der Kaiserl. Akademie zu Wien, vol. 122 (1913), pp. 1295-1438; II: “Über lineare Funktionaltransformationen und Funktionalgleichungen,” ibid., vol. 128 (1919), pp. 1083-1121.

M. Riesz, “Sur le problème des moments,” Arkiv för Matematik, Astronomi och Fysik, vol. 17 (1923), No. 16.

V. Romanovsky, “Sur un théorème limite du calcul des probabilités,” Recueil Mathématique de la Société Mathématique de Moscou, vol. 36 (1929), pp. 36-64.

M. H. Stone, “Linear transformations in Hilbert space,” American Mathematical Society Colloquium Publications, vol. 15.


THE JOHNS HOPKINS UNIVERSITY. 


the inequality of Schwarz. For the existence of the integral with respect to $\phi$, cf. J. Radon, loc. cit. I, pp. 1322-1324. Notice that if $n = m = 0$, we obtain $\phi(S) = c_{00} = 1$.

¹³ V. Romanovsky, loc. cit., p. 47, § 3.

¹⁴ E. K. Haviland, loc. cit., p. 634. At the time of publishing that paper, the author
was not aware that a proof of his Theorem I, under restricted conditions, and of the 
sufficient condition in his Theorem II had previously been given by Romanovsky. 
Romanovsky’s statement of this sufficient condition differs somewhat from the present 
author’s but the two proofs are effectively the same. 




A NOTE ON A PROPERTY OF FOURIER-STIELTJES TRANSFORMS 
IN MORE THAN ONE DIMENSION. 


By E. K. HAVILAND.


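If $\rho(\xi)$ is monotone in $[-\infty, +\infty]$ and $\rho(-\infty) = 0$, $\rho(+\infty) = 1$; if $\Lambda(t; \rho) = \int_{-\infty}^{+\infty} \exp(it\xi)\, d\rho(\xi)$, where $-\infty < t < +\infty$, denotes the Fourier-Stieltjes transform of $\rho$; and if, finally,

$\mathfrak{M}(f(\cdot)) = \lim_{T \to +\infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, dt,$

it is known¹ that $\mathfrak{M}(| \Lambda(\cdot\,; \rho) |^2) = \sum_k | A_k |^2$, where $A_k = \rho(\xi_k + 0) - \rho(\xi_k - 0)$ and the summation is taken over all the (at most denumerable) discontinuities of $\rho$. In particular, if $\rho$ is continuous,

(1) $\mathfrak{M}(| \Lambda(\cdot\,; \rho) |^2) = 0.$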
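The one-dimensional statement can be illustrated numerically for a purely discontinuous distribution. The sketch below is an illustration only, not the paper's argument: the helper names `fs_transform` and `mean_square`, the sample jumps in `atoms`, and the finite averaging length `T = 400` are all assumptions. It approximates the mean of the squared modulus of the transform over $[-T, T]$ by the midpoint rule and compares it with the sum of the squared jumps.

```python
import cmath


def fs_transform(atoms, t):
    """Fourier-Stieltjes transform of a distribution with jumps w at points x."""
    return sum(w * cmath.exp(1j * t * x) for x, w in atoms)


def mean_square(atoms, T, steps=20000):
    """Midpoint-rule value of (1/2T) * integral over [-T, T] of |Lambda(t)|^2."""
    h = 2.0 * T / steps
    total = 0.0
    for k in range(steps):
        t = -T + (k + 0.5) * h
        total += abs(fs_transform(atoms, t)) ** 2
    return total * h / (2.0 * T)


# Hypothetical example: jumps 0.5, 0.3, 0.2 at the points 0, 1, 2.5.
atoms = [(0.0, 0.5), (1.0, 0.3), (2.5, 0.2)]
jump_sum = sum(w * w for _, w in atoms)  # 0.25 + 0.09 + 0.04
approx = mean_square(atoms, T=400.0)
```

For finite `T` the two numbers differ by cross terms of order $1/T$, so agreement is only approximate.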


In the case of more than one dimension, the discontinuity points need no longer be denumerable, so the question arises as to what then corresponds to the foregoing result. It turns out that in the multi-dimensional case a similar result holds, the point spectrum, as defined in an earlier paper,² playing the role of the discontinuities in the one-dimensional case, while the “mild” discontinuity points, i. e., those not occurring in the point spectrum, play no role at all. More precisely, it will be shown that in more than one dimension,³ if $\phi(E)$ be a distribution function,⁴


¹ This result was stated, without proof, by Paul Lévy, Calcul des probabilités (Paris, 1925), p. 171. For a proof, cf. I. Schoenberg, “Über total monotone Folgen mit stetiger Belegungsfunktion,” Mathematische Zeitschrift, vol. 30 (1929), pp. 761-767, where reference is made to a paper of N. Wiener. Since then, (1) has often been rediscovered in connection with the unitary dynamics of Carleman and Koopman and with the statistical considerations of Khintchine. Cf. also A. Wintner and E. K. Haviland, “On the Fourier-Stieltjes transform,” American Journal of Mathematics, vol. 56 (1934), pp. 4-5.

² Cf. E. K. Haviland, “On the theory of absolutely additive distribution functions,” American Journal of Mathematics, vol. 56 (1934), p. 654.

³ For convenience, we give the proof in the two-dimensional case.

⁴ The monotone absolutely additive set function $\phi(E)$ is said to be a distribution function if $0 \le \phi(E) \le 1$ and $\phi(S) = 1$, where $S$ denotes the whole plane. Cf. E. K. Haviland, ibid., p. 627. For the definition of integrals with respect to such functions, cf. J. Radon, “Theorie und Anwendungen der absolut additiven Mengenfunktionen,” Sitzungsberichte der mathematischen-naturwissenschaftlichen Klasse der Kaiserl. Akademie zu Wien, vol. 122 (1913), pp. 1322-1324.




E. K. HAVILAND. 


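(2) $\mathfrak{M}(| \Lambda(s, t; \phi) |^2) = \sum_k [\phi(P_k)]^2,$

where

(3) $\Lambda(s, t; \phi) = \iint_S \exp[i(sx + ty)]\, d_{xy}\phi(E)$

and

$\mathfrak{M}(f(s, t)) = \lim_{T, U \to +\infty} \frac{1}{4TU} \int_{-T}^{T} \int_{-U}^{U} f(s, t)\, ds\, dt,$

and the summation on the right of (2) is taken over all points $P_k$ of the point spectrum of $\phi$.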

While the result (2) is analogous to that in the one-dimensional case, 
it is not obvious from the latter, since in two or more dimensions the singu- 
larities of a monotone function are essentially more complicated than those 
of such a function in a single dimension, where all discontinuity points belong 
to the point spectrum. 

The proof of (2) is as follows: If $\phi(E)$ be a distribution function, we define a set function $\bar{\phi}(E)$ by setting

(4) $\bar{\phi}(E) = \phi(-E),$

where $-E$ is the set symmetric to $E$ with respect to the origin. Then $\bar{\phi}(E)$ is a distribution function, and,⁵ by virtue of the Convolution Theorem for Fourier-Stieltjes transforms,⁶ $\Lambda(s, t; \phi * \bar{\phi}) = \Lambda(s, t; \phi)\, \Lambda(s, t; \bar{\phi})$ or, since $\Lambda(s, t; \phi)$ and $\Lambda(s, t; \bar{\phi})$ are conjugate complex quantities in virtue of (3) and (4),

$| \Lambda(s, t; \phi) |^2 = \Lambda(s, t; \phi * \bar{\phi}).$
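For a purely discontinuous distribution the identity above reduces to finite sums and can be verified directly. The sketch below is illustrative only: the dictionary representation of a discrete measure, the helper names `transform`, `reflect`, `convolve`, and the sample masses in `phi` are assumptions made for the example. It builds the reflected measure of (4), forms the convolution, and compares the two sides.

```python
import cmath


def transform(atoms, s, t):
    """Lambda(s, t) for a discrete measure given as {(x, y): mass}."""
    return sum(w * cmath.exp(1j * (s * x + t * y)) for (x, y), w in atoms.items())


def reflect(atoms):
    """The measure assigning to E the mass of -E, as in definition (4)."""
    return {(-x, -y): w for (x, y), w in atoms.items()}


def convolve(a, b):
    """Convolution (Faltung) of two discrete measures."""
    out = {}
    for (x1, y1), w1 in a.items():
        for (x2, y2), w2 in b.items():
            key = (x1 + x2, y1 + y2)
            out[key] = out.get(key, 0.0) + w1 * w2
    return out


# Hypothetical distribution with three point masses.
phi = {(0.0, 0.0): 0.5, (1.0, 2.0): 0.25, (-1.5, 0.5): 0.25}
psi = convolve(phi, reflect(phi))
```

In this discrete setting the mass of `psi` at the origin equals the sum of the squared masses of `phi`, which is the quantity appearing on the right of (2).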
Consequently

(5) $\mathfrak{M}(| \Lambda(s, t; \phi) |^2) = \mathfrak{M}(\Lambda(s, t; \phi * \bar{\phi}))$

and we need examine only the latter.

It is now to be shown that if $\psi(E)$ be a distribution function whose point spectrum is vacuous, then $\mathfrak{M}(\Lambda(s, t; \psi)) = 0$. Since the contribution of the integration domain $S - R$ to $\Lambda(s, t; \psi)$, where $S$ represents the entire $(x, y)$-plane and $R$ an arbitrary rectangle in that plane having its sides parallel to the coördinate axes, is in absolute value less than $\epsilon$ for all $(s, t)$ provided $R$ is sufficiently large, it is sufficient to prove that for any fixed $R$ and for sufficiently large values of $T$, $U$


⁵ $\psi_1 * \psi_2$ denotes the symbolical product (Faltung or convolution) of $\psi_1$ and $\psi_2$. Cf. E. K. Haviland, loc. cit., p. 651, Theorem IV.
⁶ Cf. E. K. Haviland, ibid., Theorem V.




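$\left| (4TU)^{-1} \int_{-T}^{T} \int_{-U}^{U} \iint_R \exp[i(sx + ty)]\, d_{xy}\psi(E)\, ds\, dt \right| < \epsilon.$

We begin the proof of this statement by observing that the expression beneath the absolute value signs may be written as

$(4TU)^{-1} \int_{-T}^{T} \int_{-U}^{U} \Big\{ \iint_R \exp[i(sx + ty)]\, d_{xy}\psi(E) \Big\}\, ds\, dt,$

and it is permissible to invert the order of integration,⁷ obtaining

$\iint_R \frac{\sin Tx}{Tx} \frac{\sin Uy}{Uy}\, d_{xy}\psi(E) = \sum_{\mathrm{I}}^{\mathrm{V}} \iint \frac{\sin Tx}{Tx} \frac{\sin Uy}{Uy}\, d_{xy}\psi(E),$

where, $R$ being the rectangle $(-M \le x \le M;\ -N \le y \le N)$,

I: $(\delta < x \le M;\ -N \le y \le N)$, II: $(-M \le x < -\delta;\ -N \le y \le N)$,

III: $(-\delta \le x \le \delta;\ \delta < y \le N)$, IV: $(-\delta \le x \le \delta;\ -N \le y < -\delta)$,

V: $(-\delta \le x \le \delta;\ -\delta \le y \le \delta)$.

Now

$\left| \iint_{\mathrm{V}} \frac{\sin Tx}{Tx} \frac{\sin Uy}{Uy}\, d_{xy}\psi(E) \right| \le \iint_{\mathrm{V}} d_{xy}\psi(E)$

and if $(0, 0)$ is not a point of the point spectrum of $\psi$, this last expression may be made less than $\epsilon/5$ in absolute value by taking $\delta$ sufficiently small. $\delta$ being thus fixed, it is easily seen that

$\left| \iint_{\mathrm{I}} \frac{\sin Tx}{Tx} \frac{\sin Uy}{Uy}\, d_{xy}\psi(E) \right| \le (T\delta)^{-1} \iint_{\mathrm{I}} d_{xy}\psi(E) < \epsilon/5$

and

$\left| \iint_{\mathrm{II}} \frac{\sin Tx}{Tx} \frac{\sin Uy}{Uy}\, d_{xy}\psi(E) \right| < \epsilon/5$

provided $T$ is sufficiently large. Similarly,

$\left| \iint_{\mathrm{III}} \frac{\sin Tx}{Tx} \frac{\sin Uy}{Uy}\, d_{xy}\psi(E) \right| < \epsilon/5$ and $\left| \iint_{\mathrm{IV}} \frac{\sin Tx}{Tx} \frac{\sin Uy}{Uy}\, d_{xy}\psi(E) \right| < \epsilon/5$

if $U$ is sufficiently large. Consequently, if the point spectrum of $\psi$ is vacuous,

$\mathfrak{M}(\Lambda(s, t; \psi)) = 0.$


⁷ Cf. E. K. Haviland, ibid., p. 640.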




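Furthermore,⁸ every absolutely additive set function $\phi$ of bounded total variation is the sum of two functions, say $\phi_1$ and $\phi_2$, of which the former has a vacuous point spectrum, while the latter is purely discontinuous (i. e., its spectrum coincides with its point spectrum). Then from the definition of $\bar{\phi}$ it follows that $\bar{\phi} = \bar{\phi}_1 + \bar{\phi}_2$ and

$\mathfrak{M}(\Lambda(s, t; \phi * \bar{\phi})) = \mathfrak{M}(\Lambda(s, t; (\phi_1 + \phi_2) * (\bar{\phi}_1 + \bar{\phi}_2)))$
$= \mathfrak{M}(\Lambda(s, t; \phi_1 * \bar{\phi}_1)) + \mathfrak{M}(\Lambda(s, t; \phi_1 * \bar{\phi}_2)) + \mathfrak{M}(\Lambda(s, t; \phi_2 * \bar{\phi}_1)) + \mathfrak{M}(\Lambda(s, t; \phi_2 * \bar{\phi}_2)).$

But the point spectra of $\phi_1 * \bar{\phi}_1$, $\phi_1 * \bar{\phi}_2$ and $\phi_2 * \bar{\phi}_1$ are vacuous by the addition rule⁹ for point spectra, so that the first three terms in the last member of the preceding equation vanish, and

(6) $\mathfrak{M}(\Lambda(s, t; \phi * \bar{\phi})) = \mathfrak{M}(\Lambda(s, t; \phi_2)\, \Lambda(s, t; \bar{\phi}_2)) = \mathfrak{M}(\Lambda(s, t; \phi_2)\, \Lambda(-s, -t; \phi_2)).$

Let points of the point spectrum of $\phi$, i. e., of $\phi_2$, be $P_k$: $(x_k, y_k)$. As they are at most denumerable, it follows that

$\Lambda(s, t; \phi_2) = \sum_k \exp[i(s x_k + t y_k)]\, \phi(P_k),$

$\Lambda(-s, -t; \phi_2) = \sum_k \exp[-i(s x_k + t y_k)]\, \phi(P_k).$

Substituting in (6) and taking account of (5), we find

$\mathfrak{M}(| \Lambda(s, t; \phi) |^2) = \mathfrak{M}\Big( \sum_j \sum_k \exp[i(s(x_j - x_k) + t(y_j - y_k))]\, \phi(P_j)\, \phi(P_k) \Big).$

On taking the mean value, all terms for which $j \ne k$ disappear, so that the right-hand side of the preceding equation becomes $\sum_k [\phi(P_k)]^2$, which proves (2).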
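The disappearance of the terms with $j \ne k$ can be made concrete for a finite point spectrum. For finite $T$ and $U$ the average of $\exp[i(sa + tb)]$ over $-T \le s \le T$, $-U \le t \le U$ equals $(\sin Ta / Ta)(\sin Ub / Ub)$, the same kernel that appears after inverting the order of integration. The following sketch is an illustration under that observation: the names `sinc` and `box_mean_abs_sq` and the sample masses are assumptions, not taken from the paper.

```python
import math


def sinc(u):
    """sin(u)/u, with the value 1 at u = 0."""
    return 1.0 if u == 0.0 else math.sin(u) / u


def box_mean_abs_sq(atoms, T, U):
    """Exact mean of |Lambda(s, t)|^2 over -T <= s <= T, -U <= t <= U
    for a purely discontinuous distribution with point masses `atoms`."""
    total = 0.0
    for (xj, yj), wj in atoms:
        for (xk, yk), wk in atoms:
            total += wj * wk * sinc(T * (xj - xk)) * sinc(U * (yj - yk))
    return total


# Hypothetical point spectrum; two of the points share the abscissa x = 1.
atoms = [((0.0, 0.0), 0.5), ((1.0, 0.0), 0.3), ((1.0, 2.0), 0.2)]
limit = sum(w * w for _, w in atoms)  # sum of squared point masses
value = box_mean_abs_sq(atoms, T=1.0e4, U=1.0e4)
```

Each off-diagonal term carries at least one factor of order $1/T$ or $1/U$, since two distinct points differ in at least one coördinate, so `value` approaches `limit` as both bounds grow.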


THE JOHNS HOPKINS UNIVERSITY. 


⁸ Cf. H. Hahn, Theorie der reellen Funktionen (Berlin, 1921), p. 414, Theorem XV.
⁹ Cf. E. K. Haviland, loc. cit., p. 654, Theorem VI.




THE THEORY OF THE SECOND VARIATION FOR THE 
NON-PARAMETRIC PROBLEM OF BOLZA.¹


By WILLIAM T. REID.


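1. Introduction. The non-parametric problem of Bolza in the calculus of variations is that of finding in a class of arcs

(1.1) $y_i = y_i(x)$ $(i = 1, \cdots, n;\ x_1 \le x \le x_2),$

satisfying the differential equations and end conditions

(1.2) $\phi_\alpha(x, y, y') = 0$ $(\alpha = 1, \cdots, m < n),$

(1.3) $\psi_\mu[x_1, y(x_1), x_2, y(x_2)] = 0$ $(\mu = 1, \cdots, p \le 2n + 2),$

one which minimizes an expression of the form

(1.4) $J = g[x_1, y(x_1), x_2, y(x_2)] + \int_{x_1}^{x_2} f(x, y, y')\, dx.$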

Sufficient conditions for the problem of Bolza have been given by Morse 
[III] * and Bliss [IV] for extremal arcs that are not only normal relative 
to the end conditions but also normal on sub-intervals. Recently, sufficient 
conditions have been obtained under weaker assumptions by Hestenes [X], 
who has replaced the usual condition of Mayer by a new condition in terms 
of a certain quadratic form involving the solutions of the accessory equations. 
Hestenes has not only been able to discard the hypothesis of normality on sub-intervals, but has also obtained sufficient conditions for an extremal arc with multipliers of the form $\lambda_0 = 1$, $\lambda_\alpha(x)$ which is not necessarily normal relative to the end conditions.
The principal result of the present paper is the following theorem: 


THEOREM A. If $E_{12}$: $y_i = y_i(x)$, $\lambda_0 = $ constant, $\lambda_\alpha(x)$, $x_1 \le x \le x_2$, is an extremal arc which satisfies the strengthened Clebsch condition, and on which there is no point conjugate to the point 1, then there exists a family


¹ Presented to the American Mathematical Society, September 5, 1934.

² Roman numerals in brackets refer to the bibliography at the end of this paper.
Only papers to which direct reference is made in the present paper are listed. For a 
more extensive bibliography the reader is referred to that given by Hestenes at the 
end of [X]. 




574 WILLIAM T. REID. 


of $n$ mutually conjugate accessory extremals $\eta_{ij}(x)$, $\zeta_{ij}(x)$ $(j = 1, \cdots, n)$ such that $| \eta_{ij}(x) | \ne 0$ on $x_1x_2$.


This theorem is fundamental in the construction of a field of extremals imbedding a given extremal, and has been proved by several authors under additional assumptions of normality.³ Theorem A has been proved by Morse by an extension of the methods which he used in [II].⁴ The chief significance of the independent proof here given is that it is a direct generalization of the method used by Bliss when the extremal arc satisfies additional normality conditions [I, pp. 729, 736], and hence is more intimately related to the methods usually used in the simpler problems of the calculus of variations than the methods of Morse and Hestenes.

Certain general properties of accessory extremals are discussed in § 2 
of this paper, and Theorem A is established in § 3. In § 4 there are proved, 
by the use of the results of § 3, further results concerning the existence of 
families of accessory extremals satisfying the condition of Theorem A. In 
particular, Theorem 4.2 gives a rather elegant method for determining such
a family of accessory extremals. It is to be noted, however, that the proof of 
the interesting result of Theorem 4.3 is independent of the results of § 3. 
Finally, in § 5 there is discussed briefly the relation of Theorem A to suffi- 
ciency theorems for the problems of Bolza and Mayer. 

Throughout the paper, the coefficients of (1.2), (1.3), and (1.4) are 
supposed to satisfy the hypotheses usually made {see [III], [IV] and [X]}. 


2. Accessory extremals. For an extremal arc $E_{12}$: $y_i = y_i(x)$, $\lambda_0 = $ constant, $\lambda_\alpha(x)$, $x_1 \le x \le x_2$, let

$F = \lambda_0 f(x, y, y') + \lambda_\alpha(x)\, \phi_\alpha(x, y, y'),$


The coefficients of $\omega$ and $\Phi_\alpha$ are supposed to have as arguments the functions $y_i(x)$, $\lambda_0$, $\lambda_\alpha(x)$ belonging to $E_{12}$. It will also be supposed that $E_{12}$ satisfies the strengthened Clebsch condition [IV, p. 264]. As usual, this condition will be denoted as III'. If we set


³ See [I], pp. 729, 736; [II]; [VI], p. 320; and [X], pp. 804, 807.

⁴ I was not aware that Morse had proved this result until the date upon which
I presented my proof to the American Mathematical Society. Morse’s paper has since 
appeared in the Transactions of the American Mathematical Society, vol. 37 (1935), 
pp. 147-160. Hestenes has informed me that subsequent to my proof of Theorem A, 
he also proved this result by the use of the formulation of the Mayer condition which 
he has used in [X]. 




_ Speak of an accessory extremal, or else to use the terms secondary differential system, 




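(2.2) $\Omega[x, \eta, \eta', \mu] = \omega[x, \eta, \eta'] + \mu_\alpha \Phi_\alpha[x, \eta, \eta'],$

then the system of accessory differential equations is

(2.3) $\frac{d}{dx} \Omega_{\eta'_i} - \Omega_{\eta_i} = 0, \quad \Phi_\alpha = 0.$

By the introduction of the canonical variables $\zeta_i = \Omega_{\eta'_i}[x, \eta, \eta', \mu]$, this system is seen to be equivalent to a system of $2n$ linear differential equations of the first order of the form

(2.3') $\eta'_i = A_{ij} \eta_j + B_{ij} \zeta_j, \quad \zeta'_i = C_{ij} \eta_j - A_{ji} \zeta_j.$

The coefficients in (2.3') are continuous on $x_1x_2$, $\| B_{ij} \|$ and $\| C_{ij} \|$ are symmetric matrices, and $\| B_{ij} \|$ is of rank $n - m$ on $x_1x_2$.⁵ We shall say that a set of functions $\eta_i(x)$, $\zeta_i(x)$ which are of class $C'$, and which satisfy (2.3') on $x_1x_2$ is an accessory extremal.⁶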
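A small numerical experiment shows why the canonical form is convenient when speaking of mutually conjugate accessory extremals. Assuming the form $\eta'_i = A_{ij}\eta_j + B_{ij}\zeta_j$, $\zeta'_i = C_{ij}\eta_j - A_{ji}\zeta_j$ with symmetric $\| B_{ij} \|$, $\| C_{ij} \|$, the bilinear expression $\eta^{(1)}_i \zeta^{(2)}_i - \eta^{(2)}_i \zeta^{(1)}_i$ is constant along any two solutions; we take "conjugate" here to mean that this constant is zero, which is an assumption about the definition cited from [I, p. 738]. The sketch below uses a hypothetical system with $n = 2$ and a classical Runge-Kutta step; the coefficient functions `A`, `B`, `C` and all helper names are illustrative choices, not data from the paper.

```python
def rhs(x, state, A, B, C):
    """Right-hand side of the canonical system; state = eta followed by zeta."""
    n = len(state) // 2
    eta, zeta = state[:n], state[n:]
    a, b, c = A(x), B(x), C(x)
    d_eta = [sum(a[i][j] * eta[j] + b[i][j] * zeta[j] for j in range(n))
             for i in range(n)]
    d_zeta = [sum(c[i][j] * eta[j] - a[j][i] * zeta[j] for j in range(n))
              for i in range(n)]
    return d_eta + d_zeta


def integrate(state, x0, x1, A, B, C, steps=2000):
    """Classical fourth-order Runge-Kutta integration from x0 to x1."""
    h = (x1 - x0) / steps
    x = x0
    for _ in range(steps):
        k1 = rhs(x, state, A, B, C)
        k2 = rhs(x + h / 2, [s + h / 2 * d for s, d in zip(state, k1)], A, B, C)
        k3 = rhs(x + h / 2, [s + h / 2 * d for s, d in zip(state, k2)], A, B, C)
        k4 = rhs(x + h, [s + h * d for s, d in zip(state, k3)], A, B, C)
        state = [s + h / 6 * (p + 2 * q + 2 * r + w)
                 for s, p, q, r, w in zip(state, k1, k2, k3, k4)]
        x += h
    return state


def bracket(s1, s2):
    """eta1 . zeta2 - eta2 . zeta1 for two solutions of the canonical system."""
    n = len(s1) // 2
    return sum(s1[i] * s2[n + i] - s2[i] * s1[n + i] for i in range(n))


# Hypothetical coefficients: B and C symmetric, B singular (rank 1 = n - m, m = 1).
A = lambda x: [[0.0, x], [0.5, 0.0]]
B = lambda x: [[1.0, 0.0], [0.0, 0.0]]
C = lambda x: [[1.0 + x, 0.2], [0.2, 2.0]]

sol1 = integrate([0.0, 0.0, 1.0, 0.0], 0.0, 1.0, A, B, C)  # eta(0) = 0
sol2 = integrate([0.0, 0.0, 0.0, 1.0], 0.0, 1.0, A, B, C)  # eta(0) = 0
sol3 = integrate([1.0, 0.0, 0.0, 0.0], 0.0, 1.0, A, B, C)  # zeta(0) = 0
```

The first two solutions start with vanishing bracket and keep it, up to integration error; the pair `sol3`, `sol1` starts with bracket 1 and keeps that value.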

If the extremal $E_{12}$ is a minimizing arc which is normal on every sub-interval $x_1x_3$ of $x_1x_2$ it has been shown by Bliss that there can be no point conjugate to 1 on $E_{12}$ between 1 and 2.⁷ We shall use IV to denote this necessary condition, and IV′ to denote the condition that there is no value $x_3$ such that $x_1 < x_3 \le x_2$ and defining a point 3 conjugate to 1.

We shall say that the order of anormality⁸ of $E_{12}$ on a sub-interval $t_1t_2$ of $x_1x_2$ is equal to $r$ if on this sub-interval there are exactly $r$ linearly independent accessory extremals $\eta_i = u_{i\kappa}(x)$, $\zeta_i = v_{i\kappa}(x)$ $(\kappa = 1, \cdots, r)$ with $u_{i\kappa} \equiv 0$ on $t_1t_2$.

The following properties of accessory extremals will be given without proof.


Property 1°. The order of anormality of $E_{12}$ on a given sub-interval is at most $m$.


⁵ See, for example, [VIII], §§ 3 and 4.

⁶ The terminology accessory differential system for the system (2.3) is due to von Escherich. The problem of minimizing the second variation in a class of arcs satisfying the equations of variation has been called the accessory minimum problem [III, and X], and the associated boundary value problem has been termed the accessory boundary value problem [III, VIII and X]. On the other hand, a set of functions $\eta_i(x)$ belonging to a solution $\eta_i$, $\mu_\alpha$ of (2.3), or to a solution $\eta_i$, $\zeta_i$ of (2.3'), has been called a secondary extremal [III, and X]. It seems more consistent either to speak of an accessory extremal, or else to use the terms secondary differential system, secondary minimum problem, secondary boundary value problem, and secondary extremal. Due to the priority of the term accessory for the differential system, the present author has adopted the phrase accessory extremal in the sense defined above.

⁷ See [I], p. 725. The reader is referred to [I] for the definition of conjugate point.

⁸ This terminology has been used by Hestenes, [X], p. 799.




Property 2°. If the order of anormality of $E_{12}$ on a sub-interval $t_1t_2$ is $r$, and $\eta_i \equiv 0$, $\zeta_i = v_{i\kappa}(x)$ $(\kappa = 1, \cdots, r)$ are linearly independent accessory extremals on this sub-interval, then for arbitrary admissible variations $\eta_i(x)$ and arbitrary points $x'$, $x''$ of $t_1t_2$, we have

$v_{i\kappa}(x)\, \eta_i(x) \Big|_{x'}^{x''} = 0.$


We shall denote by $r(x)$ the order of anormality of $E_{12}$ on the sub-interval $x_1x$ $(x_1 < x \le x_2)$ of $x_1x_2$. The function $r(x)$ is seen to be monotone non-increasing on $x_1 < x \le x_2$. In view of Property 1°, we have


Property 3°. There exists a constant $d$ such that $0 < d \le x_2 - x_1$, and $r(x)$ is constant on $x_1 < x \le x_1 + d$.


The following property is a consequence of the continuity of the solutions $\eta_i$, $\zeta_i$ of (2.3'):


Property 4°. If $x_1 < x_3 \le x_2$, there exists a $\delta$ such that $0 < \delta < x_3 - x_1$, and $r(x) = r(x_3)$ on $x_3 - \delta \le x \le x_3$.


In view of the above properties, it is seen that $r(x)$ has at most $m$ points of discontinuity on $x_1 < x \le x_2$. We shall denote these points by $t_1, \cdots, t_g$, where $x_1 < t_1 < \cdots < t_g < x_2$. For convenience, let $t_0 = x_1$, $t_{g+1} = x_2$, and $r_q = r(t_{q+1})$ $(q = 0, 1, \cdots, g)$.

It is readily seen that one may choose a family of accessory extremals 
(7 =1,° such that 


= 0, =O on ty for 
(2. 4) 
(L1) (21) = $j; (65 = 1, =0 if (1,j— 


Now define another set of n accessory extremals u; nej(X), Vinsj(Z) by the 
initial conditions 


(2. 5) moj = Vij (21), Vi (41) = 0 
Finally, define wi;(z|q), vij(x|q) (g as follows: 


| q) | n+j (2), Vij (x | q) n+j (2) for j= 


(2. 6) 


The following property follows readily from the definition of a conjugate 
point: 




Property 5°. A value on the sub-interval ta, (q=0,1, 
-++,g) defines a point on Ey. conjugate to the point 1 if and only if one 
of the following conditions is satisfied : 


(a) the matrix | uia(21) (s =1,---+,2n) has rank less than 2n — rq. 
| Wis (2s) 


(8) the matrix || Uim(as) || = || (m= has 
rank less than n — fq. 
(y) the determinant | ui; (23 | q)| is equal to zero. 


It is verified readily that the accessory extremals of each of the n parameter 
families defined by (2.4), (2.5) or (2.6) are mutually conjugate [see I, 
p. 738] in pairs. 


3. Proof of Theorem A. For clarity, the essential steps in the proof of this theorem will be stated in the form of lemmas.


LemMA 3.1. In terms of the accessory extremals defined by (2.4) and 
(2.5), define n mutually conjugate accessory extremals Ui;(%;p) Vij p) 
as follows: 


(©; p) = Ui + (x) Vis p) = Vi + pris (Z), 
Uij (3p) = Vis (23 p) = for 


Then if $E_{12}$ is an extremal satisfying the conditions of Theorem A, the determinant $| U_{ij}(x; p) |$ is different from zero on $x_1 < x \le x_2$ for $p$ positive and sufficiently large in value.


This lemma will be proved by induction. In view of condition (y) of 
Property 5° of § 2, it is seen that | wj(z|q)| #0 on ton < tS tq 
(q=0,1,---,g). Now for z,<2St, we have Ui;(r;p) 9), 
and hence for arbitrary values p we have | Ui;(z;p)| ~0ona, <@@Sty. 

It will now be proved that if for a value g =o there exists a positive value 
p=p; such that | Ui;(x;p.)| #0 on < te, then there exists a value 
p2 > p, such that | Uij(x; p2)| AO on < By hypothesis, 
| p:)| ~ 0 on < 2S te. Hence there exists an such that 
le<te+e< te, and | Uij(@3p1)| AO on It will first 

be shown that if p > p,, then | Ui;(z;p)| 0 also For 
if there were a value 2, on this interval such that | Uis(ts; p)| 0, there 
would exist constants c; not all zero and such that Ui;(233p)c; =0. For 
these constants, let 




i(x) = Vis (2; p) ey 


(3.2) (x) = p) ej, 


Since on 2,t, we have Ui;(r;p) =Uij(x3:), and | Uij(x3pi1)| 40 on 
2, <2£Ste+e, it would follow that the functions defined by (3. 2) 
are of the form = Uij (2; p1)aj(@) on and the functions 
a;(x) are of class C’ on this interval. Moreover, on 2,t, we have a;(x) =c; 
(j=1,---,n). In view of condition III’ and the Clebsch transformation 


of the second variation [I, p. 739], it would follow that 


On the other hand, by direct integration we obtain 


2o[2, 4, 9 = — (21) (21), 


and as a consequence of the initial values of Ui;(x; p) and Vis (2; p), it would 
follow that 


@3 Tg 
(3. 4) f x, 9, + (21) Vis (213 = (p — ¢;*. 
j= 


Relation (3.4) is seen to be a contradiction to (3.3) unless c,=0 
(r=1,---,7,). In this latter case, since | Vij (rs; p1)| 40 it would follow 
that cj; —=0 (j—1,---,”), which is a contradiction. We have proved, 
therefore, that if | on << then for 
we have | Ui;(2;p)| 0 on this interval. 

Finally, it is to be noted that on tt-+eSxS to, the determinant 
| Uis(x;p)| is a polynomial in p of degree ry —re-1 whose leading coefficient 
is | wij(x;a@—1)|, and therefore different from zero. Hence for p sufficiently 
large in absolute value we have | Uij(z;p)| 40 on te+eSeS ton 
Combining these results, we have that there exists a positive value p, such 
that p2 > p, and | Ui;(r;p2)| AO on << We have established, 
therefore, an induction proof of Lemma 3. 1. 

In order to complete the proof of Theorem A, let 2 = 2, — (ty —%); 
and define the coefficients of » and ® on 7,S2< 2, by the following 


identities in (2, 7, 7’): 


(3. 5) 1] o[ 2a, — #,[ 2, 22, 1]. 


In the canonical system (2. 3’) we then have for 72 < 1%, 




(3. 6) =—Aij(22,— Bij (x) = Bij(241— 2), 
(x) = C4; (227, — 


If ni =0, is a solution of (2. 3’) on then =0, (24, — 
is a solution of this system on 2,t,, and conversely. Hence on every sub- 
interval 2,2, of 2 2, the order of anormality is ry. Moreover, on Zo2%;, the 
condition III’ holds. 

It is to be emphasized that after the coefficients of the system (2.3), or 
(2. 3’), have been defined on 22, by relations (3.5), these coefficients will in 
general be discontinuous at x 2;. Hence, by a solution of this system on 
we shall understand a set of functions which are continuous 
on this interval, and which, except possibly at z,, have derivatives satisfying 
equations (2. 3”). We shall still denote by wij(x), % am, the 
solutions of this modified system on x x. which satisfy the initial conditions 
(2.4). A set of ry linearly independent solutions with y;(z) =0 on a2, is 
given by the functions ui,(x), vir(z) (r= 79). 

Now let v*ij(z) (7 =1,:°+,) be continuous functions such that 


= Vij (%1), and 


(8.7) (2) on aon, 1f rj 1,-- +, 793; 1,- 


Moreover, let p be such that | Ui;(z;p)| on 
Corresponding to a point ¢ on 2 a, there exist mutually conjugate solu- 
tions t), t) of (2. 3’) on such that 


mis(t3t) =vis(t), ¢) = pv*ij(t) for (7 =1,---, 79), 


For t=, the initial conditions (3.8) reduce to the initial values of 
Uij(%;p), Vis(23p). Hence, i; = (25 p), = Vis (2; p) 
ON 

It will now be shown that for ¢ < 2, and sufficiently near to 2, the 
determinant | 4i;(z;t)|A0 on 2,22. It will first be noted that on a sub- 
interval where S 2’ < a, and 2,2’ the order of anormality 
is ry. The following lemma may then be proved by the method used by Bliss 
[I, p. 739] to prove a corresponding result for an are which satisfies stronger 
normality conditions. 


Lemma 3.2. There exists an interval 4, 
(9<d<t,—zx,) such that if n(x), f(x) is a solution of (2.3’), and 
the functions ni(x) all vanish at a point 2 of 4,—dSx< 2, and at a 




point x” of +d, then on wx” we have 4i(x) =0, = vir (2) 
where cz (r=1,- are constants. 


From the initial values of the functions v*;;(z) it is seen that there 
exists a 8 such that 0 < 8 < d, where d is determined as in Lemma 3. 2, and 
such that if then the determinant | 
(r=1,---,%;0—1+1,:--,m) is different from zero. It then follows 
from the result of Lemma 3. 2, that if ¢ is an arbitrary pointon z,—8St <2, 
and ij(z;t), £ij(x;t) is the corresponding family determined by the initial 
conditions (3.8), then | 40 ona, 

Finally, since = Ui;(z;p), it follows from the continuity of 
the solutions €ij(x;t) of (2. 3’) when considered as functions of 
that for ¢ sufficiently near to x, the determinant | yi;(x;t)| is also different 
from zero on 4; +dS=2x=2,. Therefore, for ¢ << z, and sufficiently near 
to z, the family of mutually conjugate solutions of (2. 3’) determined by the 
conditions (3.8) is such that | on This com- 
pletes the proof of Theorem A. 

The following corollary is an immediate consequence of well-known results 
for the problem of Lagrange: 


COROLLARY. Suppose $E_{12}$ is an extremal arc which satisfies the conditions of Theorem A. If $u_i(x)$, $v_i(x)$ is an accessory extremal, and $\eta_i(x)$ is an arbitrary admissible variation such that $\eta_i(x_1) = u_i(x_1)$, $\eta_i(x_2) = u_i(x_2)$, then

$\int_{x_1}^{x_2} 2\omega[x, \eta, \eta']\, dx \ge \int_{x_1}^{x_2} 2\omega[x, u, u']\, dx,$

and the equality sign holds if and only if $\eta_i(x) \equiv u_i(x)$ on $x_1x_2$.


4. Further discussion of Theorem A. It has been established in Lemma 3.1 that along an extremal arc which satisfies the conditions of Theorem A the accessory extremals $U_{ij}(x; p)$, $V_{ij}(x; p)$ defined by (3.1) are such that $| U_{ij}(x; p) | \ne 0$ on $x_1 < x \le x_2$ for $p > 0$ and sufficiently large in value. If $r_g = r_0$, that is, if the order of anormality of $E_{12}$ is the same on every sub-interval $x_1x$ of $x_1x_2$, it is seen from the first paragraph of the proof of Lemma 3.1 that this condition is true for arbitrary values of $p$. For further discussion we shall assume $r_0 > r_g$, and seek to determine a lower bound for the values of $p$ which are such that the conclusion of Lemma 3.1 is satisfied. This lower bound is not determined independent of the results of § 3, however, since use is made of the Corollary of that section.

In view of condition IV′, and condition (α) of property 5° in § 2, it is




seen that the — ro accessory extremals uix(@), (Kk = 1 + +, 2n) 
defined by (2.4) and (2.5) are such that the matrix 


Wix (21) 


is of rank 2n —vrp. As a consequence of property 2° of § 2, or by a direct 
consideration of the initial values wie(z,) =1,° - -,2n), it is seen that 
the matrix of 2n — 1, rows and 2n — ry columns 


Wix (2) 


is of rank 2n — ry. 
For simplicity of notation, let 2A[z] == (x, A + 1,° +, 2n) 
denote the quadratic form 


(4. 2) f Win dey W ix = Vix (L)Vin(L) | 


We shall also denote by By[z] = (v 2n—ry) the first members 
of the linear equations 


Vio (1) Uix(L1) = 0, 


4, 
( Wix(L2) & = 0, 


Finally, denote by 2K [z] = kx, 22, the quadratic form 


(4. 4) [ Win (21) Win(21) Wix (22) Win( Le) 


which, in view of IV’, and elementary properties of matrices, is positive 
definite. It then follows that the class of values (z,) which satisfy the 
conditions 

(4. 5) By[z] = 0, 2K[z] 


is not vacuous. Moreover, the minimum value of 2A[z] in this class of values 
is the smallest zero 1, of the determinant ® 


* See, for example, Hancock, Theory of Maxima and Minima, Ginn and Co., Boston 
(1917), pp. 103-114. Bliss has phrased his analogue of the Jacobi condition for the 
problem of Bolza in terms of the roots of a determinant of the form (4.6); see [IV], 
p. 273. It may be shown that corresponding to each of the zeros 1 = L, (h=1,--.-, 
of D(l) there exists an accessory extremal Nin satisfying with 
constants d,, the end conditions: 

See [IX], § 5, and also [V]. 




582 WILLIAM T. REID. 


bux 


(4. 6) Dit) 
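The statement that a constrained minimum of one quadratic form, normalized by a second, is the smallest zero of a bordered determinant can be checked on a small case. The sketch below is illustrative only: the matrices `A`, `K` and the vector `b` are hypothetical data, not taken from the paper, and the bordered determinant is assumed to have the classical form treated in the reference to Hancock.

```python
from fractions import Fraction

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Hypothetical data: quadratic forms z'Az and z'Kz in two variables,
# and a single linear condition b.z = 0.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(3)]]
K = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
b = [Fraction(1), Fraction(-1)]

def D(l):
    """Bordered determinant | A - l*K  b ; b'  0 | (assumed classical form)."""
    return det3([
        [A[0][0] - l * K[0][0], A[0][1] - l * K[0][1], b[0]],
        [A[1][0] - l * K[1][0], A[1][1] - l * K[1][1], b[1]],
        [b[0], b[1], Fraction(0)],
    ])

# With two variables and one linear condition, D(l) is linear in l,
# so its single zero follows from two evaluations.
d0, d1 = D(Fraction(0)), D(Fraction(1))
smallest_zero = -d0 / (d1 - d0)

# Direct minimum: b.z = 0 forces z proportional to (1, 1); the ratio
# z'Az / z'Kz is unchanged by the scaling that enforces z'Kz = 1.
z = [Fraction(1), Fraction(1)]

def quad(M):
    """Value of the quadratic form z'Mz."""
    return sum(M[r][c] * z[r] * z[c] for r in range(2) for c in range(2))

constrained_min = quad(A) / quad(K)
```

In this toy case both quantities are computed exactly with rational arithmetic, so the agreement between `smallest_zero` and `constrained_min` is an exact check rather than a numerical one.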


Finally, in view of IV’, we have that if 4:(2) is an arbitrary admissible 
variation, then there exists a unique accessory extremal ui = Uix(Z)%, 
V4 = Vix (Z) 2%, such that =i (21), Ui (%2) =i [X, p. 809]. As 
a consequence of the minimizing property of J, and the corollary of § 3, we 
have that if ;(z) is an admissible variation such that 


(4. 7) Vio(X1) (21) = 9, i = 9, ni (21) ni AO, 


then f 2w[ a, |dx = L [i (21) i (21) |. 


Now suppose that for a given value of p there exists a point x; such 
that 7, < and | Uij(%3;p)| = 0. If c; (j =1,- -,m) are constants 
not all zero and such that Ui; (23; p)c; = 0, it follows from IV’, that cc, ~ 0, 
(r=1,--+,7,). Moreover, the admissible arc defined by 


satisfies equations (4.7). On direct integration, we obtain 


9, = — = — (41) J, 


and we must have, therefore, $p \le -l_1$. We have established, therefore, the
following theorem:


THEOREM 4.1. Suppose that $E_{12}$ is an extremal arc satisfying the
conditions of Theorem A, and $U_{ij}(x;p)$, $V_{ij}(x;p)$ is the system of accessory
extremals defined by (3.1). Then for $p > -l_1$, where $l_1$ is the smallest zero
of the determinant $D(l)$, we have $|U_{ij}(x;p)| \neq 0$ on $x_1 < x \le x_2$.


Finally, there will be given a method for determining a system of mutually 
conjugate accessory extremals 4i;(x), £:;(2) with | 40 on which 
does not use directly the system defined by (3.1). 

Consider the problem of minimizing 2A[z] in the class of values (%) 
(x -+1,:--,2n) which satisfy the conditions 


(4.8) 


Uix(L2) = 0, 2K[z] == 1, 


where A[z] and K[z] are defined by (4.2) and (4.4). The minimum value 
of 2A[z] in this class of values is the smallest zero m, of the determinant 


Ky 




Uin 0 
By the use of the Corollary of § 3, and by an argument like that used 
above, it is seen that if (x) is an admissible variation such that 


= 0, 94 (21) (21) 0, then 


(4. 9) D(m) = 


In terms of the accessory extremals (2.4) and (2.5), now define a system 
of mutually conjugate accessory extremals Y;i;(r; p), Zi;(3 p) as follows 


(4.10) = Wins + puis (U3 p) = Vines (2) + (2). 


Suppose that for a given value of p there exists a point x; such that 
Sz and | Yij(x3;p)| Let dj (7 —1,- - be constants not 
all zero such that Yi;(23; p)dj —0. If we define 


ni(@) on i (x) =O on 


then (%1)i (21) = did; Moreover, by direct integration we obtain 


Hence, if p is a value such that $|Y_{ij}(x;p)| = 0$ at a point $x_3$ on $x_1 < x \le x_2$,
it follows that $p \le -m_1$. Finally, it is to be noted that


| p)| =| (21) | FO. 
We have, therefore, the following theorem. 


THEOREM 4.2. Suppose that $E_{12}$ is an extremal arc satisfying the
conditions of Theorem A, and that $Y_{ij}(x;p)$, $Z_{ij}(x;p)$ is the family of mutually
conjugate accessory extremals defined by (4.10). Then for $p > -m_1$, where
$m_1$ is the smallest zero of the determinant $D(m)$, we have $|Y_{ij}(x;p)| \neq 0$
on $x_1 \le x \le x_2$.


In Theorem 4.2 we have assumed that $E_{12}$ is an extremal satisfying the
conditions of Theorem A, that is, it satisfies the strengthened Clebsch
condition and has on it no point conjugate to the point 1. Consequently, in the
proof of Theorem 4.2 use is made of the Corollary of §3. It is significant 
to note, however, that quite independent of the results of § 3 the above proof 
of Theorem 4. 2 leads to a result which is of itself important. 




In the proof of the above theorem explicit use is made of only the
following hypotheses: (1) $E_{12}$ is an extremal arc satisfying the strengthened
Clebsch condition, (2) the point 2 is not conjugate to the point 1, and
(3) if $u_i(x)$, $v_i(x)$ is an accessory extremal and $\eta_i(x)$ is an arbitrary
admissible variation such that $\eta_i(x_1) = u_i(x_1)$, $\eta_i(x_2) = u_i(x_2)$, then the
inequality (3.9) holds. Let ya denote the multipliers corresponding to the 
accessory extremal u;,v;. Then by an expansion of the integrand function 
2w[ x, n, = 2Q[2, n, 7’, by Taylor’s formula, and an integration by parts, 
one obtains that assumption (3) is equivalent to the condition that if (2) 
is an arbitrary admissible variation such that 4i(21) = 0 then 


(4. 10) f 9, = 0. 


In view of this remark we have, therefore, that quite independent of § 3 
the above proof of Theorem 4. 2 leads to the following result: 


THEOREM 4.3. Suppose $E_{12}$ is an extremal arc which satisfies the
strengthened Clebsch condition, on which the point 2 is not conjugate to
the point 1, and such that if $\eta_i(x)$ is an arbitrary admissible variation
along $E_{12}$ having $\eta_i(x_1) = 0 = \eta_i(x_2)$, then inequality (4.10) is satisfied.
Then for $p > -m_1$, where $m_1$ is the smallest zero of the determinant $D(m)$,
we have $|Y_{ij}(x;p)| \neq 0$ on $x_1 \le x \le x_2$.


It is to be remarked that one may show the equivalence of the hypotheses 
of Theorems 4.2 and 4.3 by the use of the results of § 3 and the first neces- 
sary condition for the problem of Lagrange. However, the significance of 
Theorem 4. 3 lies in its simplicity of proof and its usefulness. For example, 
with the aid of this theorem one may establish in a simple manner the 
existence of infinitely many characteristic numbers for the boundary value 
problem which the author has previously treated by the use of other methods 
[see V and IX]. 


5. Relation of Theorem A to sufficiency theorems for the problem of 
Bolza. By the use of Theorem A the proof of a sufficiency theorem for the 
problem of Bolza as given by Bliss [IV] is still valid when the normality 
conditions there used are replaced by the weaker assumptions of normality 
with respect to the end conditions and normality on $x_1x_2$. This latter
assumption of normality is used by Bliss not only in the proof of the imbedding
theorem, but also in the proof of another result [IV, Lemma 2, p. 267]. 
Hence, Theorem A does not eliminate entirely the assumption of normality
on $x_1x_2$ in the proof of sufficient conditions as given by Bliss. The result of
Theorem A also enables one to establish sufficiency theorems for the problem
of Mayer by the methods used by Bliss and Hestenes [VI] and Hestenes [VII]
under correspondingly weakened hypotheses.

In a recent paper [IX] the author has discussed for the problem of Mayer
the relations between the boundary value problem formulation of the analogue 
of Jacobi’s condition and the analogue of this condition introduced by Bliss 
{see [IV] and [VII]}. As a consequence of Theorem A the results of 
Theorems 4.1, 4.2 and 4.4 of that paper are valid when the assumptions 
concerning normality on sub-intervals are omitted. Consequently, these results
apply directly to the problem of Bolza as well as to the problem of Mayer. 
In particular, using the terminology of that paper, we have: 


If $E_{12}$ is an extremal of the form $y_i = y_i(x)$, $\lambda_0 = 1$, which satisfies
conditions I and III’, then condition IV’c is satisfied if and only if conditions
IV’, and IV’: are satisfied.


For the problem of Bolza as treated by Hestenes [X], Theorem A leads 
to the following results, when expressed in Hestenes’ notation [ X, pp. 810-811]. 


If g is an extremal satisfying conditions I and III’, then condition V’
is satisfied if and only if condition VI’ is satisfied along g. Again, using the
notation of Hestenes, we have the following relation between the conditions 
IV’c, IV’, and IV’s of [IX], and conditions V’ and VI’ of Hestenes {see [X], 
pp. 810-816}. 


Suppose that the end conditions are regular, and that the non-tangency 
condition holds on an admissible arc g having no corners. If g satisfies
conditions I, III’ with a set of multipliers $\lambda_0 = 1$, $\lambda_\beta(x)$, then each of the four
following conditions implies the others: IV’c, IV’, with IV’p, V’, VI’. In 
case the end conditions are separated, then each of the above conditions is 
equivalent to the condition IV’ of Hestenes [X, p. 806]. 


The above results lead to obvious simplification of the sufficiency theorems 
as stated by Hestenes. 




BIBLIOGRAPHY. 


I. Bliss, “The problem of Lagrange in the calculus of variations,” American
Journal of Mathematics, vol. 52 (1930), pp. 673-747.

II. Morse, “Sufficient conditions in the problem of Lagrange with fixed end
points,” Annals of Mathematics, ser. 2, vol. 32 (1931), pp. 567-577.

III. Morse, “Sufficient conditions in the problem of Lagrange with variable end
conditions,” American Journal of Mathematics, vol. 53 (1931), pp. 517-546.

IV. Bliss, “The problem of Bolza in the calculus of variations,” Annals of
Mathematics, ser. 2, vol. 33 (1932), pp. 261-274.

V. Reid, “A boundary value problem associated with the calculus of variations,”
American Journal of Mathematics, vol. 54 (1932), pp. 769-790.

VI. Bliss and Hestenes, “Sufficient conditions for a problem of Mayer in the
calculus of variations,” Transactions of the American Mathematical Society, vol. 35
(1933), pp. 305-326; Contributions to the Calculus of Variations 1931-1932, The
University of Chicago Press, pp. 295-337.

VII. Hestenes, “Sufficient conditions for the general problem of Mayer with
variable end points,” Transactions of the American Mathematical Society, vol. 35
(1933), pp. 479-490; Contributions to the Calculus of Variations 1931-1932, The
University of Chicago Press, pp. 339-360.

VIII. Hu, “The problem of Bolza and its accessory boundary value problem,”
Contributions to the Calculus of Variations 1931-1932, The University of Chicago Press,
pp. 361-443.

IX. Reid, “Analogues of the Jacobi condition for the problem of Mayer in the
calculus of variations,” Annals of Mathematics, ser. 2, vol. 35 (1934), pp. 836-848.

X. Hestenes, “Sufficient conditions for the problem of Bolza in the calculus of
variations,” Transactions of the American Mathematical Society, vol. 36 (1934),
pp. 793-818.


THE UNIVERSITY OF CHICAGO, 
CHICAGO, ILLINOIS. 




CONCERNING SOME METHODS OF BEST APPROXIMATION, 
AND A THEOREM OF BIRKHOFF.¹


By I. M. SHEFFER. 


Introduction. The series of Taylor has been generalized in many
directions. It is our purpose here to consider a natural extension of certain
series in addition to that of Taylor, by methods analogous to those used by
Birkhoff² and by Widder³ for Taylor series.

The Widder theory is based on a best approximation definition. Let
$\{\phi_i(x)\}$ $(i = 0, 1, \ldots)$ be an infinite sequence of functions, analytic about
$x = 0$. An expression of the form $s_n(x) = \sum_{i=0}^{n} c_i\phi_i(x)$ is a “polynomial” of
order (not exceeding) n. According to Widder, the “polynomial” $s_n(x)$ is
the best approximation of order n to the function $f(x)$, analytic about $x = 0$, if
$f(x) - s_n(x)$ vanishes, together with its first n derivatives, at $x = 0$. If certain
determinants do not vanish, then $s_n$ exists and is unique. The question of
convergence of $s_n(x)$ to $f(x)$ is equivalent to that of
$s_0(x) + \sum_{n=1}^{\infty} [s_n(x) - s_{n-1}(x)]$
to $f(x)$. Now a striking situation comes to light: $s_n - s_{n-1}$ can be written
in the form

$s_n(x) - s_{n-1}(x) = f_n\Omega_n(x)$,

where $\Omega_n(x)$ is a “polynomial” of order n, independent of $f(x)$, the
function $f(x)$ making its presence felt only in the constants $\{f_n\}$. That is,

$s_0 + \sum_{n=1}^{\infty} [s_n(x) - s_{n-1}(x)] = \sum_{n=0}^{\infty} f_n\Omega_n(x)$;

so that in going from the n-th approximation to the (n + 1)-st, we have only
to add in the term $f_{n+1}\Omega_{n+1}(x)$, the terms already present remaining. Such
behavior (which is also true of the extension we have in view) we shall refer
to as the property of permanence. In the Widder case,

$\Omega_n(x) = (x^n/n!)[1 + h_n(x)]$,


¹ Presented to the American Mathematical Society, December, 1933.

² G. D. Birkhoff, “Sur une généralisation de la série de Taylor,” Comptes Rendus,
vol. 164 (1917), pp. 942-945.

³ D. V. Widder, “On the expansion of analytic functions of the complex variable in
generalized Taylor’s series,” Transactions of the American Mathematical Society, vol. 31
(1929), pp. 43-52. (This paper contains a reference to a preceding work by Widder in
which functions of a real variable are considered; but that phase of the problem does
not concern us here.)




588 I. M. SHEFFER. 


where $h_n(x)$ is analytic at $x = 0$ and $h_n(0) = 0$; and his convergence theory
is based on the following additional hypotheses:


(i) $h_n(x)$ is analytic,
(ii) $M$ exists, independent of n and x, such that $|h_n(x)| \le M/(n + 1)$,
$|x| \le R$.


In §1 we give a rather general definition of best approximation, and 
establish the property of permanence. As particular cases are the Widder case, 
and the “least square” case. In § 2 we show that condition (ii) of Widder 
can be replaced by the less restrictive condition: 


(ii’) $1 + h_n(x)$ converges uniformly, in $|x| \le R$, to a function $M(x)$,
with $M(x) \neq 0$ in $|x| \le R$.


Essentially, we replace Widder’s condition $|h_n(x)| = O(1/n)$ by the
condition $|h_n(x)| = o(1)$. Finally, in § 3, to handle convergence in some other
cases, we appeal to the method used by Birkhoff, which by means of an integral
equation extends the convergence properties of Taylor series (i.e., series in
$\{x^n\}$) to series in $\{v_n(x)\}$, where $\{v_n(x)\}$ is “sufficiently close” to $\{x^n\}$.
Only, the rôle of $\{x^n\}$ is now played by functions $\{u_n(x)\}$, which we endow
with properties analogous to those of the known functions $\{x^n\}$.


1. Some methods of best approximation. Let $\{\phi_n(x)\}$ $(n = 0, 1, \ldots)$
be a sequence of functions. We wish to assign to each function $f(x)$ (of a
certain class) a sequence of “polynomials” $\{s_n(x)\}$:


(1) $s_n(x) = c_{n0}\phi_0(x) + \cdots + c_{nn}\phi_n(x)$,


which are the best approximating “polynomials” to $f(x)$, each of its order.
This requires that we define the test for best approximation.
Let $L_n$ $(n = 0, 1, \ldots)$ be a sequence of linear operators, each of which
assigns to a function a number: $L_n[u(x)] = u_n$.


DEFINITION. By the method $\mathfrak{M}$ of best approximation, relative to the
set of operators $L_n$, is meant that determination of the set $\{s_n(x)\}$ according
to the following test of best approximation:⁴


(2) $L_i[s_n(x)] = L_i[f(x)]$, $(i = 0, 1, \ldots, n)$.


⁴ The “polynomials” $s_n(x)$ depend, of course, on the sequence of functions $\{\phi_n(x)\}$.




METHODS OF BEST APPROXIMATION. 589 


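DEFINITION. A method $\mathfrak{M}$ is non-singular, relative to a sequence $\{\phi_n(x)\}$,
if none of the following determinants vanishes:⁵


(3) $\Delta_n = \begin{vmatrix} L_0[\phi_0] & \cdots & L_0[\phi_n] \\ \vdots & & \vdots \\ L_n[\phi_0] & \cdots & L_n[\phi_n] \end{vmatrix}$, $(n = 0, 1, \ldots)$.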


We shall consider only non-singular methods. 


THEOREM 1. For a given function⁶ $f(x)$, to each n $(n = 0, 1, \ldots)$ there
is a unique best approximating “polynomial” $s_n(x)$ of order (not greater
than) n.


This follows from equations (2), since $s_n(x)$ has the form (1), so that the
determinant of system (2) is precisely $\Delta_n \neq 0$.
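The system (2) is an ordinary set of n + 1 linear equations in the coefficients of $s_n$, so it can be solved directly. The following is a minimal sketch under stated assumptions: functions are restricted to polynomials stored as coefficient lists, the functionals are taken to be those of the Widder case (derivatives at 0), and the names `solve`, `widder_functional`, `best_approximation`, together with the sample `phis` and `f`, are hypothetical choices made for illustration.

```python
from fractions import Fraction
from math import factorial

def solve(A, b):
    """Solve A c = b exactly by Gauss-Jordan elimination (A assumed non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def widder_functional(i):
    """L_i[u] = u^(i)(0) for a polynomial u given as coefficients [a0, a1, ...]."""
    return lambda u: factorial(i) * (u[i] if i < len(u) else 0)

def best_approximation(phis, functionals, f, n):
    """Coefficients c_{n0}, ..., c_{nn} of s_n in (1), obtained from system (2)."""
    A = [[functionals[i](phis[j]) for j in range(n + 1)] for i in range(n + 1)]
    b = [functionals[i](f) for i in range(n + 1)]
    return solve(A, b)

# Hypothetical sample: phi_0 = 1 + x, phi_1 = x + x^2, phi_2 = x^2 + x^3,
# and f = 1 + 2x + 3x^2 + 4x^3.
phis = [[1, 1], [0, 1, 1], [0, 0, 1, 1]]
f = [1, 2, 3, 4]
L = [widder_functional(i) for i in range(3)]
coeffs = best_approximation(phis, L, f, 2)
```

Here `coeffs` holds the coefficients of $s_2$ with respect to $\phi_0, \phi_1, \phi_2$; the resulting combination agrees with `f` in its value and first two derivatives at 0, which is the Widder test of best approximation.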
Let us form from $\{\phi_n(x)\}$ a new sequence $\{\Phi_n(x)\}$, linearly dependent
on $\{\phi_n(x)\}$:

(4) $\Phi_n(x) = b_{n0}\phi_0(x) + \cdots + b_{nn}\phi_n(x)$, $b_{nn} \neq 0$.


Clearly,⁷ the set of best approximating “polynomials” for $\{\Phi_n(x)\}$ coincides
with the set $\{s_n(x)\}$ already found for $\{\phi_n(x)\}$. We may therefore choose
one set out of the infinite number of possible ones, to be obtained from $\{\phi_n(x)\}$
as was $\{\Phi_n(x)\}$, to represent all such. There exists a “significant” set,
which we shall term the basic set.


DEFINITION. The basic set for a given sequence $\{\phi_n(x)\}$, relative to a
method $\mathfrak{M}$, is the set $\{\Phi_n(x)\}$ defined by


(5) $L_i[\Phi_n(x)] = 0$, $(i = 0, 1, \ldots, n-1)$; $L_n[\Phi_n(x)] = 1$,


where $\Phi_n(x)$ has the form (4).


In virtue of the condition $\Delta_n \neq 0$, we see that⁸ a basic set exists and is
unique.


⁵ A method $\mathfrak{M}$ may be essentially singular; i.e., is singular for all sequences
$\{\phi_n(x)\}$, as for example if n exists such that $L_0, L_1, \ldots, L_n$ are linearly dependent.
On the other hand, $\mathfrak{M}$ may be in general non-singular, but a peculiar choice of $\{\phi_n(x)\}$
may give singularity, as for example if n exists such that $\phi_0, \ldots, \phi_n$ are linearly
dependent.

⁶ It is understood that $f(x)$ and the functions $\phi_n(x)$ are within the class of
functions on which the $L_n$’s can operate.

⁷ The determinants $\Delta_n$ for $\{\Phi_n(x)\}$ are non-vanishing, as is easily seen from (3).

⁸ The only point not obvious is that in $\Phi_n(x)$, $b_{nn} \neq 0$. But if $b_{nn} = 0$, then the


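LEMMA 1. There is a sequence of constants $\{f_n\}$ such that


(6) $s_0(x) = f_0\Phi_0(x)$; $s_n(x) - s_{n-1}(x) = f_n\Phi_n(x)$, $(n = 1, 2, \ldots)$.


Since $s_0$ and $\Phi_0$ are each multiples of $\phi_0$ and $b_{00} \neq 0$, therefore $f_0$ can be
found. Now consider $T_n = s_n - s_{n-1}$. We have from (2):


$L_i[T_n] = 0$, $(i = 0, 1, \ldots, n-1)$; $L_n[T_n] = f_n$,


where we set


(7) $f_n = L_n[s_n(x) - s_{n-1}(x)]$.


If $f_n = 0$, then since $\Delta_n \neq 0$ we have $T_n \equiv 0$, thus satisfying (6). If
$f_n \neq 0$, then $T_n/f_n$ satisfies equations (5), whence by uniqueness, $T_n/f_n = \Phi_n$,
and again (6) holds.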


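COROLLARY 1. Method $\mathfrak{M}$ has the permanence property:⁹


(8) $s_n(x) = s_0(x) + \sum_{i=1}^{n} [s_i(x) - s_{i-1}(x)] = \sum_{i=0}^{n} f_i\Phi_i(x)$,


where only the constants $f_i$ (which are independent of n) depend on the
function $f(x)$.


COROLLARY 2. The constants $\{f_n\}$ are given by


(9) $f_n = \dfrac{1}{\Delta_{n-1}} \begin{vmatrix} L_0[\phi_0] & \cdots & L_0[\phi_{n-1}] & L_0[f] \\ \vdots & & \vdots & \vdots \\ L_n[\phi_0] & \cdots & L_n[\phi_{n-1}] & L_n[f] \end{vmatrix}$, $(n = 1, 2, \ldots)$; $f_0 = L_0[f]$.


For, let $g_n$ denote the determinant in the numerator of the right-hand side of
(9). In (6), $s_n - s_{n-1}$ and $f_n\Phi_n$ are linear combinations of the functions
$\phi_0, \ldots, \phi_n$, which are (as we have observed in a footnote) linearly independent.
Hence coefficients of $\phi_n$ on both sides of (6) must be equal. From (2) this
coefficient in $s_n - s_{n-1}$ is $g_n/\Delta_n$, and from (5) this coefficient in $f_n\Phi_n$ is
$f_n\Delta_{n-1}/\Delta_n$. Hence $f_n = g_n/\Delta_{n-1}$, which is (9).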
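The permanence property can be checked numerically by building $s_n$ one term at a time from the basic set and comparing with a fresh solution of system (2) at each order. This is a sketch only, assuming polynomial data stored as coefficient lists and the Widder functionals $L_i[u] = u^{(i)}(0)$; the helper names and the sample `phis`, `f` are hypothetical.

```python
from fractions import Fraction
from math import factorial

def solve(A, b):
    """Solve A c = b exactly by Gauss-Jordan elimination (A assumed non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def L(i, u):
    """Widder functional L_i[u] = u^(i)(0); u is a coefficient list."""
    return factorial(i) * (u[i] if i < len(u) else 0)

# Hypothetical sample data.
phis = [[1, 1], [0, 1, 1], [0, 0, 1, 1]]
f = [1, 2, 3, 4]
N = 2

def system_matrix(n):
    """Matrix (L_i[phi_j]) of system (2) for order n."""
    return [[L(i, phis[j]) for j in range(n + 1)] for i in range(n + 1)]

# Basic set from (5): L_i[Phi_n] = 0 for i < n, L_n[Phi_n] = 1.
basic = [solve(system_matrix(n), [0] * n + [1]) for n in range(N + 1)]

s = []             # coefficients of s_{n-1} with respect to phi_0, phi_1, ...
f_consts = []      # the constants f_0, f_1, ...
agrees = []        # whether the incremental s_n equals the direct solution
for n in range(N + 1):
    # (7), using L_n[s_n] = L_n[f] and linearity of L_n on s_{n-1}.
    f_n = L(n, f) - sum(c * L(n, phis[j]) for j, c in enumerate(s))
    f_consts.append(f_n)
    # (6): s_n = s_{n-1} + f_n * Phi_n; earlier terms are left untouched.
    s = [(s[j] if j < len(s) else 0) + f_n * basic[n][j] for j in range(n + 1)]
    direct = solve(system_matrix(n), [L(i, f) for i in range(n + 1)])
    agrees.append(s == direct)
```

Each pass of the loop adds a single new term to the previous approximation; the list `agrees` records that this incremental construction reproduces the direct solution of (2) at every order.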


COROLLARY 3. The functions $\{\Phi_n(x)\}$ are given by


first n equations of (5) tell us, since $\Delta_{n-1} \neq 0$, that $b_{n0} = b_{n1} = \cdots = b_{n,n-1} = 0$,
so that the (n + 1)-st equation of (5) is not satisfied; a contradiction.

⁹ From this follows the curious fact that if we choose $f(x) = \Phi_n(x)$, then zero
is the best approximating “polynomial” of all orders less than n.




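(10) $\Phi_n(x) = \dfrac{1}{\Delta_n} \begin{vmatrix} L_0[\phi_0] & \cdots & L_0[\phi_n] \\ \vdots & & \vdots \\ L_{n-1}[\phi_0] & \cdots & L_{n-1}[\phi_n] \\ \phi_0(x) & \cdots & \phi_n(x) \end{vmatrix}$, $(n = 1, 2, \ldots)$; $\Phi_0(x) = \phi_0(x)/\Delta_0$.


(10) follows at once from (5) and the uniqueness of a basic set.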


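COROLLARY 4. The “polynomials” $s_n(x)$ are given by


(11) $s_n(x) = -\dfrac{1}{\Delta_n} \begin{vmatrix} L_0[\phi_0] & \cdots & L_0[\phi_n] & L_0[f] \\ \vdots & & \vdots & \vdots \\ L_n[\phi_0] & \cdots & L_n[\phi_n] & L_n[f] \\ \phi_0(x) & \cdots & \phi_n(x) & 0 \end{vmatrix}$, $(n = 0, 1, 2, \ldots)$.


For, operate on the right-hand member of (11) with $L_i$ $(0 \le i \le n)$, and
subtract the resulting last row from the row with index i, using the property
$L_i[0] = 0$. Then expand in terms of the elements in this row of index i.
There results $L_i[f]$, $(i = 0, 1, \ldots, n)$. That is, the right side of (11) is a
linear combination of $\phi_0, \ldots, \phi_n$ satisfying the conditions (2). But system
(2) has a unique solution $s_n(x)$; whence (11) follows.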

Let us now return to the definition of a best approximation method. If
we start with a set of linear operators $M_n$, which assign functions to functions:
$M_n[u(x)] = u_n(x)$, then by choosing a sequence of numbers $\{\alpha_n\}$, we get a
method $\mathfrak{M}$ by setting $L_n[u(x)] = \{M_n[u(x)]\}_{x=\alpha_n}$. In particular, we may
have $\alpha_n \equiv \alpha$.

An interesting subclass is that where the operators $M_n$ are obtained by
iteration from a single one:


$M_0 = I =$ identity, $M_1 = M$, $M_2 = M(M)$, $\ldots$, $M_n = M(M_{n-1})$, $\ldots$,


and where we choose $\alpha_n \equiv \alpha$ (which we may take as 0). The case of Widder
finds itself in this class, with $M[u(x)] = du(x)/dx$.
For any method $\mathfrak{M}$ which is non-singular relative to the set $\{\phi_n(x) = x^n\}$,
there will exist a unique basic set of polynomials $\Phi_n(x) = \sum_{i=0}^{n} a_{ni}x^i$. If
$M[u] = du(x)/dx$, then $\Phi_n(x) = x^n/n!$. Again, if $M[u] = u(x + 1) - u(x)$,
then $\{\Phi_n(x)\}$ is the set of Newton polynomials:


$\Phi_0(x) = 1$, $\Phi_n(x) = \dfrac{x(x-1)(x-2)\cdots(x-n+1)}{n!}$.


And in general, by means of these “ methods” we can define large classes 
of sets of polynomials. 
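As a check on the two examples just mentioned, the basic polynomial of a given order can be computed directly from the defining conditions $L_i[\Phi_n] = 0$ for $i < n$ and $L_n[\Phi_n] = 1$. The sketch below assumes the forward-difference operator evaluated at 0, so that $L_i[u]$ is the i-th difference of u at 0; the helper names are hypothetical.

```python
from fractions import Fraction
from math import comb

def solve(A, b):
    """Solve A c = b exactly by Gauss-Jordan elimination (A assumed non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def difference_functional(i):
    """L_i[u] = i-th forward difference of u at 0, for a Python callable u."""
    return lambda u: sum((-1) ** (i - k) * comb(i, k) * u(k) for k in range(i + 1))

def basic_polynomial(n):
    """Coefficients [a_0, ..., a_n] of Phi_n relative to 1, x, ..., x^n."""
    monomials = [lambda x, j=j: Fraction(x) ** j for j in range(n + 1)]
    A = [[difference_functional(i)(monomials[j]) for j in range(n + 1)]
         for i in range(n + 1)]
    return solve(A, [0] * n + [1])

phi3 = basic_polynomial(3)

# Newton polynomial x(x-1)(x-2)/3! expanded: (2x - 3x^2 + x^3)/6.
newton3 = [Fraction(0), Fraction(1, 3), Fraction(-1, 2), Fraction(1, 6)]
```

Under these assumptions the computed `phi3` coincides with the expanded Newton polynomial of order 3, in agreement with the statement in the text.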

In particular, consider the class of orthogonal Tchebycheff polynomial
sets. Given a function $p(x)$, a Tchebycheff set $\{T_n(x)\}$ is defined by the
relations


$\int p(x)\,T_m(x)\,T_n(x)\,dx = 0$, $(m \neq n)$.


These are equivalent (except for an undetermined multiplier in $T_n(x)$) to


$\int p(x)\,x^i\,T_n(x)\,dx = 0$, $(i = 0, 1, \ldots, n-1)$.


If we now define
$M_i[u(x)] = \int p(t)(t - x)^i u(t)\,dt$,
then
$L_i[T_n(x)] = \{M_i[T_n(x)]\}_{x=0} = 0$, $(i = 0, 1, \ldots, n-1)$,


and, by properly normalizing, $L_n[T_n(x)] = 1$; so that $T_n(x) = \Phi_n(x)$.
We thus see that all orthogonal Tchebycheff polynomial sets can be defined
by our methods of best approximation.
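A small computation illustrates how moment functionals lead to an orthogonal basic set. The sketch assumes the weight $p(t) = 1$ and the interval [0, 1], neither of which is specified in the text, and takes $L_i[u] = \int_0^1 t^i u(t)\,dt$ with polynomials stored as coefficient lists; all names are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A c = b exactly by Gauss-Jordan elimination (A assumed non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def inner(u, v):
    """Integral over [0, 1] of u(t) v(t) dt for coefficient lists u, v."""
    return sum(Fraction(a) * Fraction(b) / (j + k + 1)
               for j, a in enumerate(u) for k, b in enumerate(v))

def moment(i, u):
    """Assumed functional L_i[u] = integral over [0, 1] of t^i u(t) dt."""
    return inner([0] * i + [1], u)

def system_matrix(n):
    """Matrix (L_i[x^j]); with these assumptions it is a Hilbert matrix."""
    return [[moment(i, [0] * j + [1]) for j in range(n + 1)] for i in range(n + 1)]

# Basic set: L_i[Phi_n] = 0 for i < n, L_n[Phi_n] = 1.
basic = [solve(system_matrix(n), [0] * n + [1]) for n in range(3)]

# Best approximation of order 1 to f(x) = x^3 from the test L_i[s] = L_i[f].
f = [0, 0, 0, 1]
s1 = solve(system_matrix(1), [moment(i, f) for i in range(2)])
```

With these choices the members of `basic` are mutually orthogonal on [0, 1], and `s1` is the same line that a least-squares fit of $x^3$ on [0, 1] would produce, since the conditions $L_i[s] = L_i[f]$ are then the normal equations.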

More generally, the same is true for least square functions: Given the
linearly independent set of functions $\{\phi_n(x)\}$, the set $\{T_n(x)\}$ is to be
defined by


$\int [f(t) - T_n(t)]^2\,dt =$ minimum,


where $T_n(t)$ ranges over all “polynomials” $T_n(x) = c_{n0}\phi_0(x) + \cdots + c_{nn}\phi_n(x)$.
By forming suitable linear combinations $\Phi_n(x) = b_{n0}\phi_0(x) + \cdots + b_{nn}\phi_n(x)$,
$b_{nn} \neq 0$, we can make $\{\Phi_n\}$ a normal orthogonal set:


$\int \Phi_m(t)\,\Phi_n(t)\,dt = 0$, $(m \neq n)$; $= 1$, $(m = n)$.


Furthermore, a method $\mathfrak{M}$ can be found for which $\{\Phi_n\}$ is the basic set. For,
we have only to define


$L_n[u(x)] = \int \Phi_n(t)\,u(t)\,dt$;


then $L_i[\Phi_n(x)] = 0$, $(i = 0, 1, \ldots, n-1)$; $= 1$, $(i = n)$,


so that $\{\Phi_n\}$ is a basic set. Now we can express $T_n(x)$ as a linear combination
in the $\Phi_n$’s: $T_n(x) = f_{n0}\Phi_0 + \cdots + f_{nn}\Phi_n$. From the minimum property
we find that


$f_{ni} = f_i = \int f(t)\,\Phi_i(t)\,dt$.


Again, $L_i[T_n] = f_{ni}L_i[\Phi_i] = f_i$,
and $L_i[f] = f_i$;
hence $L_i[T_n] = L_i[f]$, $(i = 0, 1, \ldots, n)$;


i.e., equations (2) are satisfied, so that the minimizing set $\{T_n(x)\}$ is
identical with the best approximating set $\{s_n(x)\}$.
As a final example, consider the following class of methods: Let


$J(t) \sim a_1t + a_2t^2 + \cdots$, $(a_1 \neq 0)$
be a formal power series, generating the operator


$J[u(x)] = a_1u'(x) + a_2u''(x) + \cdots$.
Now define


$M_0[u(x)] = u(x)$, $M_1[u(x)] = J[u(x)]$, $\ldots$, $M_n[u(x)] = J^n[u(x)]$, $\ldots$,


giving the method $\mathfrak{M}$:
$L_n[u(x)] = \{M_n[u(x)]\}_{x=0}$.


We have already pointed out the cases $J(t) = t$, giving $J[u] = du/dx$, and
$J(t) = e^t - 1$, giving $J[u] = u(x + 1) - u(x)$. For the general J there
will exist a set of best approximating polynomials $\{\Phi_n(x)\}$, which is of
considerable interest in the study of functional equations based on the operator
$J[u(x)]$. This aspect of these polynomial sets will not concern us here, but
we wish to point out an interesting recurrence relation among the polynomials
of the set $\{\Phi_n(x)\}$. It is this:¹⁰


$J[\Phi_n(x)] = \Phi_{n-1}(x)$, $(n = 1, 2, \ldots)$.
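The recurrence can be verified exactly for the difference operator $J[u] = u(x + 1) - u(x)$ and the Newton polynomials. The following sketch uses rational arithmetic at a handful of sample points; the function names and the choice of sample points are hypothetical.

```python
from fractions import Fraction

def newton(n, x):
    """Phi_n(x) = x(x-1)...(x-n+1)/n!, with Phi_0 = 1."""
    value = Fraction(1)
    for k in range(n):
        value *= (Fraction(x) - k) / (k + 1)
    return value

def J(u):
    """Forward difference operator: J[u](x) = u(x + 1) - u(x)."""
    return lambda x: u(x + 1) - u(x)

# Sample points include non-integers; a polynomial identity of degree below 6
# that holds at 13 distinct points holds identically.
points = [Fraction(p, 2) for p in range(-6, 7)]
recurrence_holds = all(
    J(lambda x, n=n: newton(n, x))(x) == newton(n - 1, x)
    for n in range(1, 6)
    for x in points
)
```

The same pattern applies to the other case named in the text, where J is differentiation and $\Phi_n(x) = x^n/n!$.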


Let us turn once again to the Widder case. We can write 


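¹⁰ The following two particular cases are well-known:


$\dfrac{d}{dx}\left[\dfrac{x^n}{n!}\right] = \dfrac{x^{n-1}}{(n-1)!}$,


$\Delta\left[\dfrac{x(x-1)\cdots(x-n+1)}{n!}\right] = \dfrac{x(x-1)\cdots(x-n+2)}{(n-1)!}$.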




gn 


or, on setting $\Phi_n(x) = x^n/n!$:


That is, $\{\Phi_n\}$ is a basic set for $\{x^n\}$, and $\{\Omega_n\}$ is a basic set for some
sequence, say $\{\omega_n(x)\}$; and $\Omega_n(x)$ has the above expression in terms of the
set $\{\Phi_n\}$. This is a fairly general phenomenon, as the following theorem will
show.

Let $\mathfrak{M}$ be a given method, and $\{\phi_n(x)\}$ a sequence relative to which $\mathfrak{M}$
is non-singular. Then, as we have seen, there exists a unique basic set $\{\Phi_n\}$.


THEOREM 2. Let $\{\phi_n(x)\}$, $\{\omega_n(x)\}$ be two sequences for which $\mathfrak{M}$ is
non-singular, and let $\{\Phi_n(x)\}$, $\{\Omega_n(x)\}$ be their basic sets. If


(i) each $\omega_n(x)$ has a convergent $\Phi_n$-expansion in a region R:

(12) $\omega_n(x) = \sum_{i=0}^{\infty} a_{ni}\Phi_i(x)$;


(ii) the operators $L_0, L_1, \ldots$ (which define $\mathfrak{M}$) are term-wise applicable
to the above expansions:


(13) $L_m[\omega_n(x)] = \sum_{i=0}^{\infty} a_{ni}L_m[\Phi_i(x)]$;


then the set $\{\Omega_n(x)\}$ has the form (convergent in R)

(14) $\Omega_n(x) = \Phi_n(x) + c_{n,n+1}\Phi_{n+1}(x) + c_{n,n+2}\Phi_{n+2}(x) + \cdots$.

We observe first that since $\Omega_n$ is a linear combination of $\omega_0, \ldots, \omega_n$, it possesses
a $\Phi_n$-expansion convergent in R: $\Omega_n(x) = \sum_{i=0}^{\infty} c_{ni}\Phi_i(x)$. Again, condition (ii)
allows term-wise operation by $L_m$ on this series: $L_m[\Omega_n] = \sum_{i=0}^{\infty} c_{ni}L_m[\Phi_i]$.
Now $\{\Omega_n\}$, $\{\Phi_n\}$ are basic sets, so (5) holds for them. Taking $m = 0, 1, \ldots, n$,
this yields the relations

$\sum_{i=0}^{m} c_{ni}L_m[\Phi_i] = 0$, $(m = 0, 1, \ldots, n-1)$; $= 1$, $(m = n)$,

which, on recalling that $L_m[\Phi_m] = 1$, gives the following values for $c_{ni}$:
$c_{n0} = c_{n1} = \cdots = c_{n,n-1} = 0$, $c_{nn} = 1$. Hence (14) holds.


2. Convergence in the Widder case.¹¹ We are here concerned with the
convergence properties of $\Omega_n$-series, where


¹¹ We have already remarked that in this section we shall lighten one of Widder’s
conditions. We add that the method used here is more direct than that of Widder.




(15) $\Omega_n(x) = (x^n/n!)[1 + h_n(x)]$, $h_n(0) = 0$,
with $h_n(x)$ analytic in $|x| \le R$.
THEOREM 3. Suppose constants c, N, $B_n$ exist such that
(i) $0 < c \le |1 + h_n(x)| \le B_n$
uniformly in $|x| \le R$ for all $n > N$, with¹²


(ii) $\limsup B_n^{1/n} \le 1$.
If the series

(16) $\sum_{n=0}^{\infty} f_n\Omega_n(x)$

converges for a single point $x = \xi$ in $|x| \le R$, it converges uniformly and
absolutely in every closed region lying¹³ in $|x| < |\xi|$, thus representing
an analytic function in $|x| < |\xi|$.

For:


$\sum |f_n\Omega_n(x)| = \sum |f_n\Omega_n(\xi)|\,|\Omega_n(x)/\Omega_n(\xi)| \le A(x) + (M/c) \sum_{n > N} B_n\,|x/\xi|^n$,


where $A(x)$ is the sum of the absolute values of the first N terms, and M is a
bound (which exists) of $|f_n\Omega_n(\xi)|$. $\Omega_n(\xi)$ can vanish only if $\xi = 0$ or
$1 + h_n(\xi) = 0$. Now the theorem is vacuously true if $\xi = 0$; and $1 + h_n(\xi) \neq 0$
by virtue of (i). Hence we can assume that $\Omega_n(\xi) \neq 0$; and the indicated
division is possible. Since¹⁴ $\limsup B_n^{1/n} \le 1$, the last series converges
uniformly and absolutely in every closed region in $|x| < |\xi|$, and this is then
true of the original series.

It is seen that condition (i), although applying throughout $|x| \le R$,
is used only at the point $x = \xi$. Now it may happen that for some points in
$|x| \le R$ a number c (depending on the point) exists, and for other points
it does not. This suggests strengthening Theorem 3 as follows:


THEOREM 3’. Let $\mathcal{S}$ be the set of those points $\xi$ in $|x| \le R$ for
which c, N, $B_n$ exist (as functions of $\xi$) such that

A class of ©,-series (that arose in the study of some linear differential equations) 


is the 9) ,,-series of Transactions, American Mathematical Society, vol. 35 (1933), 


pp. 184-214. 

¹² Incidentally, (ii) combined with $c \le B_n$ gives $\limsup B_n^{1/n} = 1$.

¹³ It follows that if the region of convergence does not go outside of $|x| \le R$,
then it is a circular region.

¹⁴ If condition (ii) is replaced by (ii’) $\limsup B_n^{1/n} = K$, then the region for
which convergence can be asserted is $|x| < |\xi|/K$.




(i) $0 < c(\xi) \le |1 + h_n(\xi)| \le B_n(\xi)$, $n > N(\xi)$,
with
(ii) $\limsup\,[B_n(\xi)]^{1/n} \le 1$.


If series (16) converges for a single point $x = \xi$ in $\mathcal{S}$, it converges uniformly
and absolutely in every closed region lying in $|x| < |\xi|$, to an analytic
function.


The proof of Theorem 3 applies to 3’. 


LEMMA 2. A function cannot have two $\Omega_n(x)$-expansions uniformly
convergent in an open region containing the point $x = 0$.


For on subtracting we should have $0 = \sum a_n\Omega_n(x)$, uniformly convergent in
that region. By successive term-wise differentiations (which are permissible)
at $x = 0$, we find that $a_n = 0$ $(n = 0, 1, \ldots)$.

Let us try to develop the function $1/(t - x)$ (t a parameter) in an
$\Omega_n$-series. Assume that


(17) $1/(t - x) = \sum_{n=0}^{\infty} L_n(t)\,\Omega_n(x)$.


By (formal) term-wise differentiation and setting $x = 0$, we get

(18)
$0!/t = L_0(t)$,
$1!/t^2 = L_0(t)h_0'(0) + L_1(t)$,
$\cdots$
$n!/t^{n+1} = \binom{n}{0}L_0(t)h_0^{(n)}(0) + \binom{n}{1}L_1(t)h_1^{(n-1)}(0) + \cdots + \binom{n}{n-1}L_{n-1}(t)h_{n-1}'(0) + L_n(t)$,


thus determining the functions $\{L_n(t)\}$. $L_n(t)$ is, in fact, a polynomial in
$1/t$ of degree $n + 1$.
Define $\lambda_n$ as the maximum of $|h_n(x)|$ in $|x| \le R$:


(19) $|h_n(x)| \le \lambda_n$, $|x| \le R$.
Then
(20) $|h_n^{(k)}(0)| \le k!\,\lambda_n/R^k$.


Let r be any positive number, and let $\rho = \min(r, R)$. A simple application
of (20) to (18) yields the inequalities


$|L_0(t)| \le 0!/\rho$; $|L_1(t)| \le (1!/\rho^2)(1 + \lambda_0)$;
$|L_2(t)| \le (2!/\rho^3)(1 + \lambda_0)(1 + \lambda_1)$;



uniform in $|t| \ge r$; and a straightforward induction proof gives


LEMMA 3. For all n, and uniformly in $|t| \ge r$, where r is any positive
number and $\rho = \min(r, R)$,


(21) $|L_n(t)| \le (n!/\rho^{n+1})(1 + \lambda_0)(1 + \lambda_1)\cdots(1 + \lambda_{n-1})$.


Then, for $|t| \ge r$, $|x| \le R$,


$\sum |L_n(t)\,\Omega_n(x)| \le \sum |L_n(t)|\,|x^n/n!|\,|1 + h_n(x)| \le (1/\rho) \sum_{n=0}^{\infty} \left(\prod_{i=0}^{n} (1 + \lambda_i)\right) |x/\rho|^n$.


If $u_n$ is the n-th term of the series on the right, $u_{n+1}/u_n = (1 + \lambda_{n+1})\,|x/\rho|$.


THEOREM 4. Consider the series $\sum_{n=0}^{\infty} L_n(t)\,\Omega_n(x)$, where $\{L_n(t)\}$ is given
by (18), and $|h_n(x)| \le \lambda_n$, $|x| \le R$. If $\limsup \lambda_n = K$ (finite), the series
converges uniformly and absolutely in $|x| \le l$, $|t| \ge r$, where r is any
positive number, $\rho = \min(r, R)$, and l is any positive number less than $\rho/(K + 1)$;
and the series represents, in this region, the function $1/(t - x)$.


The convergence properties stated follow from the preceding relations. Let
$H(x, t)$ be the sum of the series; it is analytic in x and t in $|x| \le l$, $|t| \ge r$.
Term-wise differentiation in x (which is permissible) with $x = 0$ gives


$[\partial^n H(x, t)/\partial x^n]_{x=0} = n!/t^{n+1}$, $(n = 0, 1, \ldots)$;


hence H coincides with $1/(t - x)$.
Especially interesting is the case $K = 0$, in which case $\limsup \lambda_n = \lim \lambda_n = 0$.


THEOREM 5. If $\lim \lambda_n = 0$, then series (17) is valid, converging
uniformly and absolutely in $|x| \le l$, $|t| \ge r$, where $r > 0$ is arbitrary,
$\rho = \min(r, R)$, and l is any positive number less than $\rho$.
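The simplest instance of series (17) is the pure Taylor case, in which every $h_n$ vanishes identically. Then $\Omega_n(x) = x^n/n!$, the relations (18) give $L_n(t) = n!/t^{n+1}$, and (17) reduces to the geometric expansion of $1/(t - x)$. The sketch below checks this numerically; the special case $h_n = 0$, the sample values of `x` and `t`, and the truncation at 60 terms are all assumptions made for illustration.

```python
from math import factorial

def L(n, t):
    """L_n(t) = n!/t^(n+1), the solution of (18) when every h_n is zero."""
    return factorial(n) / t ** (n + 1)

def Omega(n, x):
    """Omega_n(x) = x^n/n! in the assumed special case h_n = 0."""
    return x ** n / factorial(n)

# Sample point with |x| < |t|; t is complex to emphasise that (17) is an
# identity in the complex plane.
x, t = 0.3, 1 + 0.5j
partial_sum = sum(L(n, t) * Omega(n, x) for n in range(60))
error = abs(partial_sum - 1 / (t - x))
```

Since $|x/t|$ is about 0.27 here, the truncation error after 60 terms is far below floating-point precision, so `error` reflects rounding only.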


THEOREM 6. If $\lim \lambda_n = 0$, every function $f(x)$, analytic about $x = 0$,
possesses an $\Omega_n$-expansion. If the distance from $x = 0$ to the nearest
singularity of $f(x)$ is a, and if $\sigma = \min(a, R)$, this expansion is uniformly and
absolutely convergent in $|x| \le r$, where r is any positive number $< \sigma$, and
the coefficients of the expansion are given by


(22) $f(x) = \sum_{n=0}^{\infty} f_n\Omega_n(x)$, $f_n = (1/2\pi i) \int_C f(t)\,L_n(t)\,dt$,

¹⁵ If $\limsup \lambda_n = K$ (finite), every $f(x)$, analytic about $x = 0$, has an $\Omega_n$-expansion,
but with reduced radius of convergence.




C being any contour around $t = 0$ and within $|t| < \sigma$. Moreover, $f(x)$
has only one $\Omega_n$-expansion.


The convergence property follows at once on multiplying (17) through by
$f(t)$ and integrating around C, using the Cauchy integral formula. All that
remains is the uniqueness proof. From $\lim \lambda_n = 0$ follows the existence of
constants c, N, $B_n$ satisfying the conditions of Theorem 3. Hence if $f(x)$
possesses an expansion different from (22) and convergent in at least one point
$x = \xi \neq 0$, it converges uniformly in an open region containing $x = 0$. Lemma 2
now applies, to give us a contradiction; hence there is uniqueness.


COROLLARY. The sets $\{\Omega_n\}$, $\{L_n\}$ are normal-orthogonal on C, any
contour in $|t| < R$, surrounding $t = 0$:

$(1/2\pi i) \int_C \Omega_m(t)\,L_n(t)\,dt = 0$, $(m \neq n)$; $= 1$, $(m = n)$.

For, $\Omega_m(x)$ is analytic in $|x| \le R$ and therefore possesses a unique
$\Omega_n$-expansion, the coefficient of $\Omega_n(x)$ being the above integral. Hence normality
and orthogonality hold.
Theorem 6 can be given a different form: 


THEOREM 7. Let the condition $\lim \lambda_n = 0$ be replaced by the following
condition: $1 + h_n(x)$ converges uniformly in $|x| \le R$ to a function
$M(x)$ which is nowhere zero in $|x| \le R$. Then the conclusion of Theorem 6
is valid with the modification that $f_n$ is now given by


(23) $f_n = (1/2\pi i) \int_C [f(t)/M(t)]\,L^*_n(t)\,dt$,


where $\{L^*_n(t)\}$ is defined by (18) with $h_n(x)$ replaced by $g_n(x)$, the latter
defined by

(24) $1 + h_n(x) = M(x)[1 + g_n(x)]$.

Clearly, $g_n(x) \to 0$ uniformly, $|x| \le R$. The series $f(x) = \sum_{n=0}^{\infty} f_n\Omega_n(x)$ is
identical with the series $f(x)/M(x) = \sum_{n=0}^{\infty} f_n(x^n/n!)[1 + g_n(x)]$, and since
$\{\lambda_n\}$ exists such that $|g_n(x)| \le \lambda_n$, $\lim \lambda_n = 0$, the second series expansion
is valid by Theorem 6. The theorem now follows from the fact that the class
of functions $\{f(x)\}$ analytic about $x = 0$ coincides with the class of functions
$\{f(x)/M(x)\}$.

3. Extension of the Birkhoff theory. The method of this section is 




adapted from the work of Birkhoff, as was mentioned.*® We start with a set 
of functions {u,(x)}, with given convergence properties, and seek to determine 
the convergence properties of a second set of functions {vn(x)}, related to 
{un(x)} only quantitatively. We shall introduce certain assumptions labelled 
Condition A, B, C; and it is to be understood that once a Condition has been 
stated, it is to hold from then on to the end of the section. 

Consider a sequence of functions {un(x)} satisfying Condition A:

(i) un(x) is analytic in the interior & of a rectifiable, simple closed
curve C, and is continuous in & + C.

(ii) The function 1/(t−x), t a parameter, possesses a un-expansion

(24)    $1/(t-x) = \sum_{n=0}^{\infty} L_n(t)\,u_n(x),$

which is uniformly convergent in x and t for t on C and x on any closed point
set in &; and the functions {Ln(t)} are continuous on C.¹⁷


COROLLARY. Every function f(x) that is analytic in & and continuous
in & + C, has a un(x)-expansion, uniformly convergent on every closed point
set in &:

(25)    $f(x) = \sum_{n=0}^{\infty} f_n\,u_n(x),$    (26)    $f_n = (1/2\pi i) \int_C f(t)\,L_n(t)\,dt.$

COROLLARY. If Σ |Ln(t)un(x)| converges uniformly, t on C and x on
any closed set in &, then series (25) converges absolutely, x in &.


Now consider the set {vn(x)}, which is to be “close” to the set {un(x)}
in a sense to be defined. We assume that vn(x) is analytic in & and
continuous in & + C. Suppose we have the expansion

(27)    $f(x) = \sum_{n=0}^{\infty} \phi_n\,v_n(x).$


¹⁶ A number of papers have been written on subjects related to the work of Widder
and of Birkhoff. References are to be found in Widder’s paper and also in: Walsh,
Transactions of the American Mathematical Society, vol. 31 (1929), pp. 53-57. In this
section we do not emphasize generality of statement in our theorems. Rather, we aim
to secure a comprehensive body of theorems that are symmetric (i.e., interchangeable)
in the two sets of functions {un}, {vn}; and that can be utilized in treating
convergence of series of best approximating “polynomials.”

¹⁷ In Birkhoff’s case, un(x) = x^n, so that Condition A holds when C is any circle
with center at x = 0.
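As a rough numerical illustration of Condition A (ii) in Birkhoff's case, the sketch below sums the expansion of 1/(t−x) for un(x) = x^n. The choice Ln(t) = 1/t^(n+1) is an assumption made here (the ordinary geometric series); the text does not write Ln out, and the sample points are arbitrary.

```python
# Illustration only: Birkhoff's case u_n(x) = x**n.
# ASSUMPTION: L_n(t) = 1 / t**(n + 1) (geometric series); not stated in the text.

def partial_expansion(x: complex, t: complex, terms: int) -> complex:
    """Partial sum of sum_n L_n(t) * u_n(x) with the assumed L_n."""
    return sum(x**n / t**(n + 1) for n in range(terms))

x = 0.3 + 0.2j            # a point inside the unit circle
t = complex(0.0, 1.0)     # a point on the unit circle C
approx = partial_expansion(x, t, 60)
exact = 1 / (t - x)
error = abs(approx - exact)   # small, since |x/t| < 1
```

The error shrinks geometrically in the number of terms because |x/t| < 1 for x inside the circle and t on it.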




600 I. M. SHEFFER. 


In analogy with (25, 26), we are led to consider the possibility¹⁸ of defining
the coefficients φn by

(28)    $\phi_n = (1/2\pi i) \int_C g(t)\,L_n(t)\,dt,$

g(t) to be determined.
We see from (28) and (25, 26) that

(29)    $g(x) = \sum_{n=0}^{\infty} \phi_n\,u_n(x).$

Substitution of (28) into (27) yields

(30)    $f(x) = (1/2\pi i) \int_C g(t) \Big\{ \sum_{n=0}^{\infty} v_n(x)\,L_n(t) \Big\}\,dt.$


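This integral equation is not well-adapted to determine g(t). We can obtain
an equation of the second kind by subtracting from (30) the relation

(31)    $g(x) = (1/2\pi i) \int_C g(t) \Big\{ \sum_{n=0}^{\infty} u_n(x)\,L_n(t) \Big\}\,dt.$

In fact, we then have

(32)    $f(x) = g(x) + (1/2\pi i) \int_C K(x,t)\,g(t)\,dt,$

where

(33)    $K(x,t) = \sum_{n=0}^{\infty} [v_n(x) - u_n(x)]\,L_n(t).$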


We now assume 


Condition B. Series (33) converges uniformly for x in & + C and t
on C, and

    $|K(x,t)| < 2\pi/l$

for x and t on C, where l = length of C.
It is Condition B that is the test of {vn(x)} being “close” to {un(x)}.

COROLLARY. K(x,t) is analytic in x in the region & for each t on C,
and is continuous in x and t for t on C and x in & + C.


We observe that the formal work from (27) to (33) is valid if we work
backwards; i.e., given g(x), assumed to be analytic in & and continuous in
& + C; then (28), (29), (31) hold, and (29) is uniformly convergent for
x on any closed set in &. If now f(x) is defined by (32), then f(x) is seen
to be analytic in & and continuous in & + C; and by combining (31) and
(32), then (30) holds, the series within the brace being uniformly convergent
in x and t for t on C and x on any closed set in &. (30) may then be
integrated term-wise, yielding series (27), which is uniformly convergent for
x on any closed set in &.

¹⁸ Our argument is purely formal until our conclusions are stated and proved.

Our aim, however, is to start with f(x) and determine g(x). In (32)
let x be chosen on C. As x and t traverse C they are functions of the arc
length (measured from some point on C):

    x = Θ(s),    t = Θ(σ).

Our hypothesis on C assures us of the existence of dΘ(σ)/dσ almost
everywhere.¹⁹ Point x being on C, let us set

    F(s) = f(Θ(s)),    G(s) = g(Θ(s)),    K*(s,σ) = (1/2πi) K(Θ(s), Θ(σ)) Θ′(σ).

Equation (32) then reduces to the equivalent form

(32′)    $F(s) = G(s) + \int_0^l K^*(s,\sigma)\,G(\sigma)\,d\sigma.$


F(s) is continuous; and so is K*(s,σ) except on a set of measure zero (due
to the possible non-existence of Θ′(σ)), where it can be defined so as to be
bounded for s, σ in 0 ≤ s, σ ≤ l. The Fredholm theory can be applied.
Because of the second part²⁰ of Condition B, the Neumann series for a
solution of (32′) converges uniformly, so that λ = 1 is not a characteristic number.
Hence (32′) has a unique solution G(s); and it is continuous, 0 ≤ s ≤ l.

This continuous function G(s) defines a continuous function g(x)
(x on C) satisfying (32); and g(x) is a unique solution of (32). We now
extend the definition of g(x) to & by means of (32), where x is now in &.
g(x) is seen to be analytic in &. Suppose x, in &, approaches a point α of C.
The functions f(x), (1/2πi) ∫_C K(x,t)g(t)dt being continuous in & + C,
it follows from (32) that g(x) → g(α), where g(α) is the value, at x = α,
of the unique solution of (32) for x on C. This gives us


¹⁹ And at such points where Θ′(σ) fails to exist, the difference quotient is
nevertheless bounded: |ΔΘ/Δσ| ≤ 1. Where Θ′ does exist, it has the value Θ′(σ) = e^{iθ},
where θ is the angle which the tangent to C (at the point σ) makes with the real axis.

²⁰ Cf. Whittaker and Watson, Modern Analysis, 4th ed., pp. 221-222. A remark
of Birkhoff (loc. cit.) is apropos here: It is not necessary that |K(x,t)| be less than
2π/l. All the work of this section will hold if we merely assume that in the integral
equations (32) and (40), λ = 1 is not a characteristic number.




LEMMA 4. To every function f(x), analytic in & and continuous in
& + C, there corresponds a unique solution g(x) of (32); and g(x) is also
analytic in & and continuous in & + C.
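The Neumann-series argument behind Lemma 4 can be tried numerically on a real-variable analogue of (32′). The sketch below is illustrative only: the interval [0, 1], the kernel 0.5·cos(s − σ), the right-hand side e^s and the midpoint quadrature are all assumptions chosen so that the kernel is small enough for successive substitution to converge, in the spirit of the second part of Condition B.

```python
import math

def neumann_solve(F, K, h, iterations=200):
    """Solve F = G + h * (K @ G) by successive substitution G <- F - h * (K @ G)."""
    n = len(F)
    G = list(F)
    for _ in range(iterations):
        G = [F[i] - h * sum(K[i][j] * G[j] for j in range(n)) for i in range(n)]
    return G

# Hypothetical data: midpoint rule on [0, 1] with a small, smooth kernel.
n = 50
h = 1.0 / n
nodes = [(i + 0.5) * h for i in range(n)]
K = [[0.5 * math.cos(s - sigma) for sigma in nodes] for s in nodes]
F = [math.exp(s) for s in nodes]

G = neumann_solve(F, K, h)

# Largest violation of the discretized equation F = G + h * (K @ G).
residual = max(
    abs(F[i] - G[i] - h * sum(K[i][j] * G[j] for j in range(n)))
    for i in range(n)
)
```

Here the discretized operator has norm at most 0.5, so each pass of the iteration roughly halves the error; a kernel that is not small in this sense would need a different solver.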


Having obtained the function g(x), the observation made after Condition
B, on going from g(x) to f(x), enables us to state

THEOREM 8. Every function f(x), analytic in & and continuous in
& + C, has a vn(x)-expansion

(27)    $f(x) = \sum_{n=0}^{\infty} \phi_n\,v_n(x),$    (28)    $\phi_n = (1/2\pi i) \int_C g(t)\,L_n(t)\,dt,$

uniformly convergent on every closed set in &. In (28), g(t) is the function
of Lemma 4.

COROLLARY. If series Σ |un(x)Ln(t)| converges uniformly for x in &
and t on C, and series Σ |[vn(x) − un(x)]Ln(t)| converges uniformly for
x in & + C and t on C, then series (27) (with φn given by (28)) converges
absolutely for all x in &.


For: We have g = Σ φn un(x), f = Σ φn vn(x), so that

    $f - g = \sum_{n=0}^{\infty} (1/2\pi i) \int_C g(t)\,L_n(t)\,[v_n(x) - u_n(x)]\,dt.$

By hypothesis this series converges absolutely. Again, from the second
Corollary to Condition A, g = Σ (1/2πi) ∫_C g(t) Ln(t) un(x) dt converges
absolutely. Hence on adding, (27) converges absolutely, x in &.
The functions {Ln(t)} are defined only on C, where they are continuous.
We can extend their definition to €, the region exterior to C:

DEFINITION. For z in €, Ln(z) is defined to be

(34)    $L_n(z) = \frac{-1}{2\pi i} \int_C \frac{L_n(t)}{t - z}\,dt.$

Ln(z) is analytic in €, and Ln(∞) = 0.²¹


THEOREM 9. The series

(35)    $1/(z-x) = \sum_{n=0}^{\infty} L_n(z)\,u_n(x)$

holds uniformly in x and z for x on any closed set in & and z on any “closed”²²
set in €.

²¹ There is no reason for supposing, without further hypotheses, that Ln, defined
in € + C by (34) and by Condition A, is continuous in € + C. Later, when we do
have a further condition, this assertion can be made. (See Theorem 20.)

To show this, observe that for z in €, 1/(z−x) is analytic (in x) in & and
continuous in & + C, so that (Corollary to Condition A) it has a uniformly
convergent un(x)-expansion:

    $1/(z-x) = \sum_{n=0}^{\infty} \phi_n(z)\,u_n(x), \qquad \phi_n(z) = \frac{1}{2\pi i} \int_C \frac{L_n(t)}{z - t}\,dt.$

From (34) we see that φn(z) = Ln(z). Since Σ un(x)Ln(t) converges
uniformly, x on a closed set in & and t on C, therefore term-wise integration
(after multiplication by 1/(z−t)) is permissible, the resulting series being
uniformly convergent for x and z in the regions stated.


THEOREM 10. If f(z) is analytic in € + C, it has the Ln(z)-expansion

(37)    $f(z) = \sum_{n=0}^{\infty} f_n\,L_n(z), \qquad f_n = (1/2\pi i) \int_C f(t)\,u_n(t)\,dt,$

uniformly convergent for z on any “closed” set in €.

For: We can find a rectifiable simple closed curve J inside C such that f(z)
is analytic on J and exterior to J. Then, by (35),

    $f(z) = \frac{1}{2\pi i} \int_J \frac{f(x)}{z - x}\,dx = \sum_{n=0}^{\infty} L_n(z) \Big[ \frac{1}{2\pi i} \int_J f(x)\,u_n(x)\,dx \Big],$

the series being uniformly convergent for z on a “closed” set in €. Now
J can be chosen as close to C as we wish; whence it follows, from the
continuity of un(x) and f(x) in the closed region consisting of J, C and the ring
that they bound, that in the coefficient of Ln(z) the curve of integration J
may be replaced by C without altering values. That is, (37) holds.

²² By a “closed” set in € we shall mean both the usual closed set and also any
unbounded set in € (including z = ∞), the important feature being that the set is
at a positive distance from C.

Consider again equation (33). The series being uniformly convergent
for x in & + C and t on C, we may multiply through by 1/(t−z), z in €,
and integrate term-wise:

    $\frac{-1}{2\pi i} \int_C \frac{K(x,t)}{t - z}\,dt = \sum_{n=0}^{\infty} [v_n(x) - u_n(x)] \Big( \frac{-1}{2\pi i} \int_C \frac{L_n(t)}{t - z}\,dt \Big),$

the resulting series being uniformly convergent for x in & + C, and z on any
“closed” set in €. Now by (34), the parenthesis has the value Ln(z). This
enables us to extend the definition of K to €:


LEMMA 5. The series

(3)    $K(x,z) = \sum_{n=0}^{\infty} [v_n(x) - u_n(x)]\,L_n(z)$

converges uniformly in x and z for x in & + C and z on any “closed” set
in €, so that K(x,z) is analytic in x and z for x in & and z in €. Moreover,

(38)    $K(x,z) = \frac{-1}{2\pi i} \int_C \frac{K(x,t)}{t - z}\,dt,$

where K(x,t) is given by series (33); and K(x,∞) = 0.
In the expansion (27), the coefficients φn are given in terms of g(x).
The question arises if we can express φn directly in terms of f(x):

(39)    $\phi_n = (1/2\pi i) \int_C f(t)\,M_n(t)\,dt,$

where the functions Mn(t) are to be determined. Since (32) holds for x in
& + C, we may substitute for f(t) in (39) its value as given by (32). On
further using the relation φn = (1/2πi) ∫_C g(t)Ln(t)dt, this gives

(a)    $(1/2\pi i) \int_C g(t) \Big\{ M_n(t) - L_n(t) + (1/2\pi i) \int_C K(w,t)\,M_n(w)\,dw \Big\}\,dt = 0.$

Now (a) is to hold for all g, and we want Mn to be independent of f (and
therefore of g). This suggests that we set the brace equal to zero:

(40)    $L_n(t) = M_n(t) + (1/2\pi i) \int_C K(w,t)\,M_n(w)\,dw.$

This integral equation can be thrown into “real” form, as was (32). The
resulting kernel is K**(s,σ) = (1/2πi) K(Θ(σ), Θ(s)) Θ′(σ), so, as was the
case with (32), (40) has a unique solution Mn(t) (t on C), and Mn(t) is
continuous.


THEOREM 11. In Theorem 8, the coefficients φn can also be expressed by
(39), where Mn(t) is the unique and continuous solution of (40), t on C.

To see this, we observe first that (a) is satisfied. On using (28) and (32),
(a) simplifies to

    $\phi_n = (1/2\pi i) \int_C M_n(t) \Big[ g(t) + (1/2\pi i) \int_C K(t,w)\,g(w)\,dw \Big]\,dt = (1/2\pi i) \int_C M_n(t)\,f(t)\,dt,$

which is (39).

By means of (40), with t replaced by z, we can extend the definition of Mn:

DEFINITION. For z in €,

(41)    $M_n(z) = L_n(z) - (1/2\pi i) \int_C K(t,z)\,M_n(t)\,dt,$

Mn(t) being the unique and continuous solution of (40).

COROLLARY. Mn(z) is analytic in z, with Mn(∞) = 0; and for z in €,

(42)    $M_n(z) = \frac{-1}{2\pi i} \int_C \frac{M_n(t)}{t - z}\,dt.$

We need only establish (42). If we multiply (40) through by [(−1)/2πi]
× [1/(t−z)] and integrate over C, and use (34) and (38), we get

    $L_n(z) = \frac{-1}{2\pi i} \int_C \frac{M_n(t)}{t - z}\,dt + \frac{1}{2\pi i} \int_C K(w,z)\,M_n(w)\,dw.$

Comparison with (41) then yields (42).
THEOREM 12. The expansion

(43)    $1/(z-x) = \sum_{n=0}^{\infty} v_n(x)\,M_n(z)$

is uniformly convergent in x and z for x on any closed set in & and z on any
“closed” set in €.

For: In (32), choose f(x) = 1/(z−x), z in €. Then g(x) = g(x,z)
is defined by (32) to be analytic in x and z for x in & and z in €; and is
continuous for x in & + C and z in €. If we multiply series (24) through
by (1/2πi)·[1/(z−t)] and integrate term-wise around C, we observe that
g(x,z) has a un(x)-expansion (cf. (29)) that is uniformly convergent in x
and z, x on any closed set in & and z on any “closed” set in €:

    $g(x,z) = \sum_{n=0}^{\infty} u_n(x)\,\phi_n(z), \qquad \phi_n(z) = (1/2\pi i) \int_C L_n(t)\,g(t,z)\,dt.$

If we now combine (31, 32, 33), where g(x) = g(x,z) (so that
f(x) = f(x,z) = 1/(z−x)), and recall that (31) converges uniformly in
both x and z, then (30) is seen to hold, also uniformly convergent in both
x and z. Term-wise integration gives (27), again uniformly convergent in
x and z: 1/(z−x) = Σ φn(z)vn(x). It remains to identify the coefficient
of vn(x). This coefficient is the coefficient φn given by (27) and (28);
and by Theorem 11 this coefficient has the value given by (39):

    $\phi_n(z) = \frac{1}{2\pi i} \int_C \frac{M_n(t)}{z - t}\,dt.$

Comparison with (42) shows that φn(z) = Mn(z), and the theorem is
established.
From this follows (cf. Theorem 10)

THEOREM 13. If f(z) is analytic in € + C, it has the Mn(z)-expansion

    $f(z) = \sum_{n=0}^{\infty} f_n\,M_n(z), \qquad f_n = (1/2\pi i) \int_C f(t)\,v_n(t)\,dt,$

uniformly convergent for z on any “closed” set in €.

There is apparent, by this time, a duality between the sets {un}, {Ln}
on the one hand and the sets {vn}, {Mn} on the other. We shall now examine
to what extent their rôles can be interchanged. If H(x,t) exists, having the
relation to {vn} that K(x,t) has to {un}, we should expect it to be given
by the series

(45)    $H(x,t) = \sum_{n=0}^{\infty} [u_n(x) - v_n(x)]\,M_n(t).$

For the moment we shall put aside the problem of convergence of this series.
If we substitute (40) into (33), we get the relation²³

(46)    $H(x,t) + K(x,t) + (1/2\pi i) \int_C H(x,w)\,K(w,t)\,dw = 0.$

This is an integral equation for H(x,t), with the same kernel K(w,t) as in
(40). Knowing the properties of K(x,t), we can therefore state the following
properties for H(x,t):


LEMMA 6. The function H(x,t) defined by (46) is the only solution;
it is continuous in x and t for x in & + C and t on C, and is analytic (in x)
for x in & and for each t on C; and is continuous in x and t for x in & + C
and t in €, and is analytic (in x and t) for x in & and t in €.

²³ This is the well-known equation for kernel and resolvent kernel in integral
equation theory.


THEOREM 14. K(x,t) is a resolvent kernel for the equation

(47)    $g(x) = f(x) + (1/2\pi i) \int_C H(x,t)\,f(t)\,dt.$

For, in the right-hand side of (47), let f have the value given by (32). On
simplifying we obtain g(x) + (1/2πi) ∫_C g(t) A(x,t)dt, where A(x,t) is the
left-hand member of (46); i.e., A(x,t) = 0, so that the right-hand side of
(47) does equal g(x), and (47) is satisfied.

Since equation (47) has the solution (32) for every function g(x) that 
is continuous on C, it follows from the Fredholm theory that (47) always 
has a unique solution. Hence 


COROLLARY 1. For every g(x), analytic in & and continuous in & + C,
equation (47) has a unique solution f(x). This solution is analytic in & and
continuous in & + C.

It is also an immediate consequence of the uniqueness of solutions of
both (32) and (47) that

COROLLARY 2. H(x,t) is a resolvent kernel of (32); that is, the
unique solution of (32) is furnished by (47).

If in (47) we set f(x) = K(x,w), we find on using (46) that
g(x) = −H(x,w). Substituting these values of f and g into (32) then
gives us the equation which is the twin of (46):

COROLLARY 3. The functions H and K satisfy the equation

(46′)    $H(x,t) + K(x,t) + (1/2\pi i) \int_C K(x,w)\,H(w,t)\,dw = 0,$

valid for x in & + C and t in € + C.
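The twin relations (46) and (46′) have a simple finite-dimensional analogue that can be checked directly. In the sketch below, matrices stand in for the kernels and matrix products for the contour integrals; this replacement, the sample matrix K, and the series H = −K + K² − K³ + ··· (valid only because K is chosen small) are assumptions made for illustration.

```python
def matmul(A, B):
    """Plain-Python product of two square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

def resolvent(K, terms=200):
    """Return H = -K + K^2 - K^3 + ..., truncated after `terms` terms."""
    n = len(K)
    neg_k = [[-entry for entry in row] for row in K]
    power = [row[:] for row in neg_k]      # current term (-K)^j, starting at j = 1
    H = [[0.0] * n for _ in range(n)]
    for _ in range(terms):
        H = matadd(H, power)
        power = matmul(power, neg_k)
    return H

def max_abs(A):
    """Largest absolute entry of a matrix."""
    return max(abs(entry) for row in A for entry in row)

# Hypothetical small kernel; absolute row sums stay below 1 so the series converges.
K = [[0.10, -0.20, 0.05],
     [0.00, 0.15, 0.10],
     [0.20, 0.05, -0.10]]
H = resolvent(K)

residual_46 = max_abs(matadd(matadd(H, K), matmul(H, K)))        # H + K + HK
residual_46_prime = max_abs(matadd(matadd(H, K), matmul(K, H)))  # H + K + KH
```

Both residuals vanish up to rounding, mirroring the fact that (I + H)(I + K) = (I + K)(I + H) = I in the matrix setting.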


LEMMA 7. The unique solution Mn(t) of equation (40) is given by

(48)    $M_n(t) = L_n(t) + (1/2\pi i) \int_C H(w,t)\,L_n(w)\,dw.$

This follows on using (46′).
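The step from (46′) to (48) can be written out as follows. This is a sketch of one way to fill in the computation, not necessarily the author's own; the dummy variable w′ is introduced here for the inner integration. Substituting (40) for Ln in the right-hand member of (48) gives

```latex
\begin{aligned}
L_n(t) + \frac{1}{2\pi i}\int_C H(w,t)\,L_n(w)\,dw
 &= M_n(t) + \frac{1}{2\pi i}\int_C K(w',t)\,M_n(w')\,dw'
   + \frac{1}{2\pi i}\int_C H(w',t)\,M_n(w')\,dw' \\
 &\quad + \frac{1}{(2\pi i)^2}\int_C\!\int_C H(w,t)\,K(w',w)\,M_n(w')\,dw'\,dw \\
 &= M_n(t) + \frac{1}{2\pi i}\int_C M_n(w')
   \Big[ H(w',t) + K(w',t) + \frac{1}{2\pi i}\int_C K(w',w)\,H(w,t)\,dw \Big]\,dw' \\
 &= M_n(t),
\end{aligned}
```

the bracket being the left-hand member of (46′) with x = w′, which is zero.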


We can now establish the validity of (45). 


THEOREM 15. H(x,t) has the expansion (45), which converges
uniformly for x in & + C and t on C.


To show this, we first have series (33), uniformly convergent in the region
stated. We therefore have (from (48)):

    $\sum_{n=0}^{\infty} [v_n(x) - u_n(x)]\,M_n(t) = \sum_{n=0}^{\infty} [v_n(x) - u_n(x)] \Big\{ L_n(t) + (1/2\pi i) \int_C H(w,t)\,L_n(w)\,dw \Big\}$
    $\qquad = \sum_{n=0}^{\infty} [v_n(x) - u_n(x)]\,L_n(t) + (1/2\pi i) \int_C H(w,t) \Big\{ \sum_{n=0}^{\infty} [v_n(x) - u_n(x)]\,L_n(w) \Big\}\,dw,$

the two series on the right converging uniformly in the region stated. Hence
this same convergence property applies to the series on the left. There remains
only to prove that this series has the value −H(x,t). But this follows from
(46′) since the right-hand side is K(x,t) + (1/2πi) ∫_C H(w,t)K(x,w)dw.
Theorem 8 can be dualized: 


THEOREM 16. If g(x) is analytic in & and continuous in & + C, then
we have

(49)    $g(x) = \sum_{n=0}^{\infty} \phi_n\,u_n(x),$    (50)    $\phi_n = (1/2\pi i) \int_C f(t)\,M_n(t)\,dt,$

uniformly convergent on any closed set in &. Here f(x) is the solution of (47).

For: In (47) replace H(x,t) by its uniformly convergent expansion (45).
There results the equation

    $g(x) = f(x) + \sum_{n=0}^{\infty} \Big[ (1/2\pi i) \int_C f(t)\,M_n(t)\,dt \Big] [u_n(x) - v_n(x)],$

the series being uniformly convergent on any closed set in &. But the series

    $f(x) = \sum_{n=0}^{\infty} \Big[ (1/2\pi i) \int_C f(t)\,M_n(t)\,dt \Big] v_n(x)$

has the same convergence property (Theorem 11); hence so has

    $\sum_{n=0}^{\infty} \Big[ (1/2\pi i) \int_C f(t)\,M_n(t)\,dt \Big] u_n(x).$

We thus have

    $g(x) = f(x) + \sum_{n=0}^{\infty} \Big[ (1/2\pi i) \int_C f(t)\,M_n(t)\,dt \Big] u_n(x) - f(x),$

from which (49) follows.
We have now an almost complete duality of {un}, {Ln} and {vn}, {Mn}.
That it is not fully complete (at least so far as has been proved) is owing to
this lack: in Condition A we are not certain that (24) holds in the region
stated, when un, Ln are replaced by vn, Mn. What we do know, up to this
point, is that (24) will hold (cf. (43)) if t is in €, rather than on C; nor
does Theorem 8 permit t to be on C. However, we can fill in the gap:


THEOREM 17. The expansion

(50)    $1/(t-x) = \sum_{n=0}^{\infty} M_n(t)\,v_n(x)$

is valid, and is uniformly convergent in x and t for x on any closed point set
in & and t on C.

To show this, we begin with the expansion

(24)    $1/(t-x) = \sum_{n=0}^{\infty} L_n(t)\,u_n(x),$

which has the convergence properties stated above. On multiplying through
by (1/2πi)H(t,w) we may integrate term-wise, the resulting series being
likewise uniformly convergent:

(b)    $(1/2\pi i) \int_C H(w,t) \cdot [1/(w-x)]\,dw = \sum_{n=0}^{\infty} u_n(x) \cdot (1/2\pi i) \int_C H(w,t)\,L_n(w)\,dw.$

Hence the series Σ {Ln(t) + (1/2πi) ∫_C H(w,t)Ln(w)dw} un(x) converges
uniformly in x and t for x on any closed set in & and t on C. But the brace
equals Mn(t) (cf. (48)); hence the series Σ Mn(t)un(x) converges uniformly
in x and t in the region stated. Now (b) simplifies to

    $(1/2\pi i) \int_C H(w,t) \cdot [1/(w-x)]\,dw = H(x,t),$

so that

    $\sum_{n=0}^{\infty} M_n(t)\,u_n(x) = 1/(t-x) + H(x,t) = 1/(t-x) + \sum_{n=0}^{\infty} [u_n(x) - v_n(x)]\,M_n(t).$

The two series are uniformly convergent in x and t for x on any closed set
in & and t on C. If then we subtract the first series from both members,
we get (50), the series having the same convergence properties. This
establishes the theorem.²⁴


²⁴ There is another point concerning duality: In Condition B we have |K(x,t)|
< 2π/l. Now we do not know that H satisfies the same condition. In fact, if we
write max |K(x,t)| = 2πσ/l, σ < 1, then from (46) all we know is that max |H(x,t)|




We may sum up, in part, as follows: If {un} satisfies Conditions A and B,
so does {vn}.²⁵

The function um(x), being analytic in & and continuous in & + C, has
a un-expansion (cf. (25)):

    $u_m(x) = \sum_{n=0}^{\infty} \Big[ (1/2\pi i) \int_C u_m(t)\,L_n(t)\,dt \Big] u_n(x).$

If a un-expansion is unique, then we have the biorthogonal property

(51)    $(1/2\pi i) \int_C u_m(t)\,L_n(t)\,dt = \delta_{mn}.$
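Relation (51) can be checked numerically in Birkhoff's case um(t) = t^m on the unit circle. As before, Ln(t) = 1/t^(n+1) is an assumption made here rather than a formula from the text, and the equally spaced quadrature with 64 nodes is an arbitrary choice.

```python
import cmath

def contour_pairing(m: int, n: int, points: int = 64) -> complex:
    """Approximate (1/(2*pi*i)) * integral over |t| = 1 of t**m * t**-(n+1) dt.

    ASSUMPTION: u_m(t) = t**m and L_n(t) = t**-(n+1) (Birkhoff's case).
    """
    total = 0j
    step = 2 * cmath.pi / points
    for k in range(points):
        t = cmath.exp(1j * step * k)
        dt = 1j * t * step                 # dt = i * t * d(theta) on the unit circle
        total += t**m * t**(-(n + 1)) * dt
    return total / (2j * cmath.pi)

same_index = contour_pairing(3, 3)    # should be close to 1
mixed_index = contour_pairing(2, 5)   # should be close to 0
```

With equally spaced nodes the rule is exact for these integrands as long as |m − n| is smaller than the number of nodes, so the computed values differ from 1 and 0 only by rounding.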

But Condition A does not insure uniqueness. For example, take u0(x) = 1;
un(x) = (x^(n−1)/n!)(x − n), (n > 0). It is readily shown²⁶ that if series
Σ cn un(x) converges, the region of convergence is the interior of a circle, center
the origin, reaching out to the nearest singularity of the function that is
defined; and the convergence is uniform on any closed set within the circle
of convergence. Moreover, every function, analytic about x = 0, has a
un-expansion; and the functions {Ln} can be defined by


In(t) (it), (n>0); Lolt) =—1/t. 


Condition A is fulfilled on choosing C as any circle with center at x = 0.
But the function zero has the uniformly convergent expansion 0 = Σ un(x)
(n = 0, 1, ···), so there fails to be uniqueness.
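The failure of uniqueness rests on a telescoping sum: since un(x) = x^n/n! − x^(n−1)/(n−1)! for n > 0, the partial sum u0 + u1 + ··· + uN equals x^N/N!, which tends to zero for every x. The short sketch below checks this identity in exact rational arithmetic; the sample points are arbitrary.

```python
from fractions import Fraction
from math import factorial

def u(n: int, x: Fraction) -> Fraction:
    """u_0(x) = 1 and u_n(x) = x**(n-1) * (x - n) / n! for n > 0, as in the text."""
    if n == 0:
        return Fraction(1)
    return x ** (n - 1) * (x - n) / factorial(n)

def partial_sum(N: int, x: Fraction) -> Fraction:
    """Exact value of u_0(x) + u_1(x) + ... + u_N(x)."""
    return sum((u(n, x) for n in range(N + 1)), Fraction(0))

x = Fraction(1, 2)
telescoped = partial_sum(6, x)
closed_form = x ** 6 / factorial(6)   # the claimed value x**N / N!
```

Because the arithmetic is exact, equality of `telescoped` and `closed_form` confirms the identity at the sampled point rather than merely approximating it.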
If there is to be uniqueness, it must appear in our assumptions. We
accordingly add the uniqueness

Condition C. If zero has the expansion 0 = Σ an un(x), uniformly
convergent on every closed set in &, then an = 0, (n = 0, 1, ···).

From this follows that a function f(x) cannot have two distinct un-expansions,
each uniformly convergent on every closed set in &. Consequently we have


≤ [σ/(1−σ)](2π/l). But the only use made of the condition |K| < 2π/l is to
secure uniqueness of the solutions of the integral equations with kernels K(x,t) and
K(t,x). Hence we ought to establish this same uniqueness for the kernels H(x,t)
and H(t,x). Now it is already known for H(x,t) (cf. Theorem 14). That it also
holds for H(t,x) is easily shown.

²⁵ It being understood that in B the inequality |K| < 2π/l is replaced by the
assumption of uniqueness of the solutions for the kernels K(x,t) and K(t,x).

²⁶ Compare the proof of Theorem 3.


LEMMA 8. The biorthogonality relations (51) hold.

LEMMA 9. The sets {vn}, {Mn} are biorthogonal:

(52)    $(1/2\pi i) \int_C v_m(t)\,M_n(t)\,dt = \delta_{mn}.$

Multiply equation (40) through by um(t) and integrate over C:

    $\delta_{mn} = (1/2\pi i) \int_C u_m(t)\,M_n(t)\,dt + (1/2\pi i)^2 \int_C \int_C K(w,t)\,M_n(w)\,u_m(t)\,dw\,dt.$

If we replace K(w,t) by its uniformly convergent expansion (33), this
reduces to

    $\delta_{mn} = (1/2\pi i) \int_C M_n(t)\,u_m(t)\,dt + (1/2\pi i) \int_C [v_m(w) - u_m(w)]\,M_n(w)\,dw,$

and on cancelling the first and third terms (whose sum is zero), there remains
relation (52).

THEOREM 18. Equations (32) and (47) are satisfied by taking
f(x) = vm(x), g(x) = um(x).

To see this, let g(x) = um(x) in (32) and replace K(x,t) by its expansion
(33). Then, using (51), f(x) = um(x) + [vm(x) − um(x)] = vm(x).

We come now to {vn}-uniqueness:

THEOREM 19. If the vn-expansion Σ cn vn(x) converges uniformly in
& + C (the sum function f(x) being therefore analytic in & and continuous
in & + C), then necessarily cn = (1/2πi) ∫_C f(t)Mn(t)dt.

Multiply the series through by Ms(x) and integrate over C:

    $(1/2\pi i) \int_C f(t)\,M_s(t)\,dt = \sum_{n=0}^{\infty} c_n\,(1/2\pi i) \int_C v_n(t)\,M_s(t)\,dt.$

By biorthogonality, the series on the right reduces to cs, thus establishing the
theorem.

Suppose we make a temporary translation of the complex variable x so as
to insure that the origin is within C. This will not affect any of the results
already obtained. Now it will be true that

    $\frac{1}{2\pi i} \int_C \frac{1}{t^m}\,\frac{dt}{t - x} = 0, \qquad (m = 1, 2, \cdots),$

whence from the uniformly convergent expansion (24), we obtain

    $\sum_{n=0}^{\infty} u_n(x) \Big\{ \frac{1}{2\pi i} \int_C \frac{L_n(t)}{t^m}\,dt \Big\} = 0, \qquad (m = 1, 2, \cdots),$

uniformly convergent for x on any closed set in &. By the uniqueness
property, all the coefficients must vanish:

(a)    $\frac{1}{2\pi i} \int_C \frac{L_n(t)}{t^m}\,dt = 0, \qquad (m = 1, 2, \cdots).$

Now for a given n, condition (a) is necessary and sufficient²⁷ that there exist
a function (which will of necessity be Ln(z)), analytic in €, and such that
as z → t on C, Ln(z) → Ln(t).

It follows that Ln(z) is continuous on € + C, and analytic in €. If we
multiply (33) through by (1/2πi)·(1/t^m) and integrate around C, we see
from (a) that

    $\frac{1}{2\pi i} \int_C \frac{K(x,t)}{t^m}\,dt = 0, \qquad (m = 1, 2, \cdots),$

so that K(x,z) also has the property that as z in € approaches a point t on C,
then K(x,z) → K(x,t). That is, K(x,z) is continuous for x in & + C and
z in € + C. Finally, from (40) we get

    $\frac{1}{2\pi i} \int_C \frac{M_n(t)}{t^m}\,dt = 0, \qquad (m = 1, 2, \cdots),$

so that Mn(z) is continuous in € + C; and therefore, also, H(x,z). To
sum up:²⁸

THEOREM 20. The functions Ln(z), Mn(z), K(x,z), H(x,z) are
continuous in x and z for x in & + C and z in € + C; and their values for
z = t on C are respectively the known functions Ln(t), Mn(t), K(x,t), H(x,t).


²⁷ If C is an analytic Jordan curve, this result holds if φ(t), the function given
on the boundary (which in our case is Ln(t)) is continuous. (Walsh, Transactions
of the American Mathematical Society, vol. 30 (1928), especially pp. 327 and 329.)
If C is rectifiable, this same result holds if φ(t) is merely Lebesgue integrable (in
which case the approach holds almost everywhere and must be non-tangential).
(Priwaloff, Comptes Rendus, vol. 178 (1924), pp. 611-614.) In the Priwaloff case
it is not clear (although probably true) that if φ(t) is continuous, then the approach
holds everywhere on C. If this is not the case, we shall regard it as assumed that
C satisfies the Walsh condition.

²⁸ If we now undo the translation that was made temporarily, none of these
properties of continuity in € + C will be altered.




In some of our theorems we had to insist on z being on any “closed” set
in €, because we had not this last theorem. It will be clear that we can now
amend some of the theorems, as follows:

COROLLARY. In Theorem 9, uniform convergence maintains for z in
€ + C; Theorems 12 and 17 combine to give uniform convergence for z in
€ + C; and Lemma 5 holds uniformly for z in € + C.

If we were to consider interchanging the rôles of {un} and {Ln}, or of
{vn} and {Mn}, we would define two functions A(z,x), B(z,x) by the series

(53)    $A(z,x) = \sum_{n=0}^{\infty} [M_n(z) - L_n(z)]\,u_n(x),$
        $B(z,x) = \sum_{n=0}^{\infty} [L_n(z) - M_n(z)]\,v_n(x).$

These functions are closely related to H and K. In fact we have the

COROLLARY. The above series (53) converge uniformly in z and x for z
on any “closed” set in € and x in & + C; and

(54)    A(z,x) = H(x,z);    B(z,x) = K(x,z).

The convergence is immediate; and (54) follows from (33), (45) and (35),
(43) (with reference to the preceding Corollary).

This is as far as we shall carry the theory of the {un}-, {vn}-sets. We
now point out how the results of the present section can be applied to the
convergence question in methods of best approximation.


THEOREM 21. Let un(x) = Φn(x), vn(x) = Qn(x), (n = 0, 1, ···) be
two basic sets relative to a method M of best approximation, and let Φn(x),
Qn(x) be analytic in a region & and continuous in & + C, C being the
boundary of &. We further assume:

(i) Conditions A and B hold.

(ii) If²⁹ {hn(x)} is any sequence of functions analytic in & and
continuous in & + C, then the operators {Li} that define the method M are
term-wise applicable to every series Σ cn hn(x) that converges uniformly on
every closed point set in &.


²⁹ We actually make use of (ii) only for the sequences hn(x) = Φn(x), Qn(x).
The following observation is of interest: By the Corollary to Condition A, each Qn(x)
has a Φn(x)-expansion uniformly convergent on any closed point set in &. Therefore
((ii) and Theorem 2)

(14)    $Q_n(x) = \Phi_n(x) + c_{n,n+1}\,\Phi_{n+1}(x) + c_{n,n+2}\,\Phi_{n+2}(x) + \cdots$

From (14) it is seen that Condition B will certainly be fulfilled if the coefficients
are chosen sufficiently small.


Under these conditions, if f(x) is any function analytic in & and
continuous in & + C, the approximating “polynomials” sn(x) of f(x) (relative
to the basic set {Qn(x)}) converge uniformly to f(x) on every closed point
set in &.

For: By Theorem 8, f(x) has a Qn(x)-expansion, uniformly convergent on
any closed set in &: (a) f(x) = Σ Fn Qn(x) (n = 0, 1, ···). On the other
hand, if sn(x) is the best “polynomial” of n-th order, then (using (6) with
Φn replaced by Qn),

(b)    s0(x) = f0 Q0(x),    sn(x) − sn−1(x) = fn Qn(x),

(c)    fn = Ln[sn(x) − sn−1(x)].

Also ((2)),

(d)    Li[si(x)] = Li[f(x)].

The theorem will be established if we show that Fn = fn, (n = 0, 1, ···),
since (cf. (8)) sn(x) = Σ fi Qi(x) (i = 0, ···, n). By (ii) we may operate³⁰
with Li on (a):

(e)    Li[f] = F0 Li[Q0] + F1 Li[Q1] + ···

Taking i = 0: L0[f] = F0 L0[Q0]; L0[s0] = f0 L0[Q0]; therefore
f0 L0[Q0] = F0 L0[Q0], and f0 = F0 since L0[Q0] ≠ 0. Now assume that
Fr = fr, (r = 0, 1, ···, i−1); we shall complete the induction for i:
(e) reduces to

    Li[f] = f0 Li[Q0] + ··· + fi−1 Li[Qi−1] + Fi;
    Li[si] = Li[s0] + Li[s1 − s0] + ··· + Li[si−1 − si−2] + Fi;
    Li[si] = Li[si−1] + Fi;    Li[si − si−1] = Fi;

therefore fi = Fi. This completes the proof.³¹
A comparison of (8) with (28) and (39) yields the

COROLLARY. The coefficients in the Qn-expansion for f(x) are given
(variously) by

(55)    $f_n = (1/2\pi i) \int_C g(t)\,L_n(t)\,dt = (1/2\pi i) \int_C f(t)\,M_n(t)\,dt = L_n[s_n(x) - s_{n-1}(x)].$


PENNSYLVANIA STATE COLLEGE. 


³⁰ We also use (5): Li[Qn] = 0, i < n.

³¹ It is worth noting that in this theorem we do not assume the uniqueness
Condition C. A possible choice of vn(x) is vn = un = Φn. Hence, if we omit the
details that make Theorem 21 precise, the sense of the theorem is contained in the
statement: If {Φn}, {Qn} are two basic sets “sufficiently close” to each other, they
give essentially the same convergence properties to the respective best approximating
“polynomials.”


GROUPS CONTAINING FIVE AND ONLY FIVE SQUARES. 


By G. A. MILLER. 


Let G represent a group such that the squares of its operators are five 
and only five distinct operators including the identity. When these five squares 
constitute a group it results that there are two and only two such groups 
which are not direct products. This is a special case of a general theorem 
relating to groups whose squares constitute a cyclic subgroup.¹ When these
squares do not constitute a group there are three possible cases, as follows: 
Two of them are of order 4 and two of order 2, two are of order 3 and two 
of order 2, or four of them are of order 2. In each of these cases the identity 
constitutes the fifth square. This is always found among the squares since 
it is the square of itself. 
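The notion of counting the distinct squares of a group can be tried on small examples. The sketch below is only an illustration with groups chosen here (cyclic groups under addition and a dihedral group written as pairs); it is not claimed that these are the groups the theorems of this paper refer to.

```python
def count_squares(elements, multiply):
    """Number of distinct elements of the form g*g."""
    return len({multiply(g, g) for g in elements})

def cyclic_group(n):
    """Integers mod n under addition (the 'square' of a is a + a)."""
    return list(range(n)), (lambda a, b: (a + b) % n)

def dihedral_group(n):
    """Pairs (k, f): rotation by k, followed by a reflection when f == 1."""
    elements = [(k, f) for k in range(n) for f in (0, 1)]

    def multiply(a, b):
        (k1, f1), (k2, f2) = a, b
        sign = -1 if f1 else 1          # a reflection inverts the next rotation
        return ((k1 + sign * k2) % n, f1 ^ f2)

    return elements, multiply

squares_c5 = count_squares(*cyclic_group(5))      # cyclic group of order 5
squares_c10 = count_squares(*cyclic_group(10))    # cyclic group of order 10
squares_d5 = count_squares(*dihedral_group(5))    # dihedral group of order 10
```

In each of these three examples the squares are exactly the five elements of a cyclic subgroup of order 5, which is the situation of the first case described above.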

When two of the squares are of order 4 then the squares generate the
abelian group of order 8 and of type (2,1) which includes two operators
of order 4 and one of order 2 which are non-squares. These 8 operators
constitute an invariant subgroup of G which corresponds to an abelian quotient
group of order 2^m and of type (1,1,1,···). This invariant subgroup appears
in an invariant abelian subgroup of order 16 and of type (3,1). Since G
includes operators of order 4 whose common square is not equal to the square
of the operators of order 4 which appear in this subgroup of order 16 the
latter operators are not commutative with the former and they give rise to
commutators of order 4 with respect to the operators of order 8 in the given
subgroup of type (3,1). As these automorphisms are of order 2 these
commutators are transformed into their inverses under G.

For the sake of brevity in the statements it will be assumed in what
follows that G is not the direct product of a group containing five and only
five operators which are squares and of an abelian group of order 2^m and of
type (1,1,1,···). The order of G cannot be less than 32 and when it is
of this order its central is the four group contained in the given invariant
abelian subgroup of type (3,1). There is one such G in which each of the
operators of this invariant subgroup is transformed into its inverse and all
of the remaining operators are of order 4 and have a common square which is
distinct from the square of the operators of order 4 contained in the given
subgroup of order 16. There is no such G in which each of the operators
of this subgroup is transformed into its third power but there are two such
G’s in which the commutator subgroup is the cyclic group of order 4 which
is not generated by an operator of order 8 contained in G. In one of these
two groups 8 of the additional operators are of order 2 while in the other
all of the additional operators are of order 4 but have two distinct squares.
Hence there results the following theorem: Every group which has the
property that it contains five and only five operators which are squares,
including such an operator of order 4, involves at least one of the three
groups of order 32 which have this property.

¹ G. A. Miller, Proceedings of the National Academy of Sciences, vol. 20 (1934),
pp. 203-206.

To determine all the possible groups of order 64 which contain five and 
only five operators which are squares including one of order 4 it is therefore 
only necessary to extend each of the three groups noted in the preceding 
theorem by 32 additional operators. These include 16 operators which are 
commutative with an operator of order 4 which is a square and hence each 
possible set of 32 additional operators includes an operator of order 2 which 
is commutative with this operator of order 4. To the first of the three given 
groups of order 32 we can adjoin three such sets of 32 operators and thus 
obtain three G’s of order 64. In two of these the given added operator of 
order 2 has only two conjugates under G while in the third it has four such 
conjugates. To each of the other two given groups of order 32 we can adjoin 
only one such set of 32 operators. As all of these groups are distinct, there 
results the theorem that there are five and only five groups of order 64 which 
separately have the property that they involve five and only five squares 
including such an operator of order 4. 

If such a G of order 128 exists it contains a subgroup of order 64 composed 
of all of its operators which are commutative with an operator $t_1$ of 
order 4 which is a square under G and all of whose operators of order 4 have 
a common square. Suppose first that this subgroup is abelian. An operator 
$t_2$ of order 4 in G whose square is different from $t_1^2$ is then commutative 
with at least 8 of the operators of this subgroup of order 64 and all of these 
operators besides the identity are of order 2. Hence G involves an operator 
of order 2 which is not contained in the subgroup generated by its squares 
but is commutative with all of its operators. It is therefore a direct product 
of an abelian group of order $2^m$ and of type (1,1,1,...) and of a group 
which involves five and only five operators which are squares thereunder. 
Since such direct products have been excluded it results that the given 
subgroup of order 64 cannot be abelian. 

Its commutator subgroup cannot include an operator of order 4 since 




GROUPS CONTAINING FIVE AND ONLY FIVE SQUARES. 617 


such an operator would be either $t_1$ or $t_1^3$ and hence it could not arise from 
an operator of order 4 or from an operator of order 2 contained in this 
subgroup. The operators of these two orders contained in this subgroup therefore 
generate a characteristic subgroup of order 32 under G. As each of the co-sets 
of this subgroup with respect to the subgroup formed by the squares under G 
involves 4 operators of order 2 and such an operator of order 2 cannot be 
transformed under this subgroup into itself multiplied by $t_1^2$ it results that 
the commutator subgroup of this group of order 64 is generated by $t_1^2$. Its 
central is of order 16 and either of type (2,1,1) or of type (3,1). In the 
former case it is easy to verify that no G can exist while in the latter case 
there is one such G. Hence there are nine groups which have the property 
that each of them contains five and only five operators which are squares 
thereunder including at least one of order 4. Three of these are of order 32, 
five are of order 64, and one is of order 128. 

Suppose that the five operators which are squares under G include an 
operator of order 3 and hence two such operators. Since all the operators 
which are squares under G are relatively commutative * it results that such 
a G involves two and only two operators of order 2 which are squares and 
that each of the operators of order 4 in G transforms its operators of order 3 
into their inverses. Hence such a G contains a subgroup of index 2 which 
is the direct product of its subgroup of order 3 and an abelian group of order 
$2^m$ and of type (1,1,1,...). Each of its remaining operators is of order 4 
since its operators of order 4 have two and only two distinct squares and 
every two operators of order 4 in G which have distinct squares are non- 
commutative. It therefore results that the commutator subgroup of G is the 
cyclic group of order 6 and that there is one and only one group which satis- 
fies the condition that it has five and only five operators which are squares 
including an operator of order 3. The order of this group is 48 and it con- 
tains the direct product of the group of order 3 and the abelian group of 
order 8 and of type (1,1,1). 
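
The description of the group of order 48 can be checked by brute force. The sketch below is an illustration only: the encoding of elements as triples, the names `phi`, `mul`, `squares`, and the particular action of the adjoined operator on the abelian group of type (1,1,1) are assumptions chosen to match the properties stated above, not a construction taken from the paper.

```python
from itertools import product

# Elements are triples (k, x, j) standing for c^k * x * t^j, where c has
# order 3, x is a 3-bit mask over generators a=1, b=2, e=4 of an abelian
# group of type (1,1,1), and t is an adjoined operator with t^2 = a.
# This encoding is a hypothetical model, not the paper's notation.
A, B, E = 1, 2, 4

def phi(x):
    """Conjugation by t on the group of order 8: fixes a and b, sends e to e*a*b."""
    return x ^ (A | B) if x & E else x

def mul(g, h):
    """Product (c^k1 x1 t^j1)(c^k2 x2 t^j2) brought back to normal form."""
    k1, x1, j1 = g
    k2, x2, j2 = h
    # Moving t past c^k2 * x2 inverts c and applies phi (only when j1 = 1).
    k = (k1 + (k2 if j1 == 0 else -k2)) % 3
    x = x1 ^ (x2 if j1 == 0 else phi(x2))
    if j1 == 1 and j2 == 1:
        x ^= A  # t * t = a
    return (k, x, (j1 + j2) % 2)

ELEMENTS = list(product(range(3), range(8), range(2)))
IDENTITY = (0, 0, 0)

def order(g):
    """Order of g, found by repeated multiplication."""
    n, p = 1, g
    while p != IDENTITY:
        p = mul(p, g)
        n += 1
    return n

def squares():
    """Set of all operators of the form g*g."""
    return {mul(g, g) for g in ELEMENTS}

if __name__ == "__main__":
    sq = squares()
    print(len(ELEMENTS), len(sq), sorted(order(s) for s in sq))
```

In this model the subgroup with `j == 0` is the direct product of the group of order 3 and the abelian group of order 8, every operator with `j == 1` is of order 4, and these operators of order 4 have exactly two distinct squares.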

It remains to consider the possible cases when the five operators which 
are squares include exactly four ($s_1$, $s_2$, $s_3$, $s_4$) which are of order 2. These 
four operators generate an abelian group whose order is either 8 or 16. We 
shall first prove that this order cannot be 16. If $s_1$, $s_2$, $s_3$, $s_4$ would generate 
a group H of order 16 then H would appear in the central of G for reasons 
which follow. Such an operator $s_1$ could not be non-commutative with another 
operator $s$ of order 2 contained in G for if it were $s_1$ and $s$ would generate 


*Ibid., vol. 19 (1933), pp. 1054-1057. 




the octic group which would involve two of the three operators $s_2$, $s_3$, $s_4$ since 
the conjugate of an operator which is a square has the same property. As 
$s_1$ and these two operators would generate the four group the four operators 
$s_1$, $s_2$, $s_3$, $s_4$ could not then generate a group of order 16. 

If one of these four operators $s_1$ would be non-commutative with an 
operator $t_1$ of order 4 contained in G, it may be assumed that $t_1^2 = s_3$ and 
that $t_1$ transforms $s_1$ and $s_2$ into each other and is therefore commutative 
with $s_1s_2$. The group of order 8 generated by $s_1$, $s_2$, $s_3$ would be invariant 
under $t_1$ and $(t_1s_1)^2$ would equal $s_1s_2s_3$, which is impossible if $s_1$, $s_2$, $s_3$, $s_4$ 
generate a group of order 16. It therefore follows that if H were of order 16 
it would appear in the central of G and every two operators of order 4 
contained in G which have different squares would be non-commutative. If such 
a G exists we may assume without loss of generality that $t_1^2 = s_1$, $t_2^2 = s_2$, 
$t_3^2 = s_3$, and $t_4^2 = s_4$. It results directly that $t_1$ transforms $t_2$ into itself 
multiplied by one of the following five operators $s_1$, $s_2$, $s_1s_2$, $s_1s_2s_3$, $s_1s_2s_4$ since 
H includes the commutator subgroup of G. We shall first prove that the 
fourth of these cases is impossible and hence the fifth is also impossible. 

In the fourth case $t_1$ and $t_2$ together with H generate a group of order 64 
and $t_1t_2$ may be assumed to be $t_3$. The operator $t_4$ transforms each of the 
operators $t_1$, $t_2$, $t_1t_2$ into itself multiplied respectively by one of the following 
operators: $s_1$, $s_4$, $s_1s_4$, $s_1s_4s_2$, $s_1s_4s_3$; $s_2$, $s_4$, $s_2s_4$, $s_2s_4s_1$, $s_2s_4s_3$; $s_3$, $s_4$, $s_3s_4$, 
$s_3s_4s_1$, $s_3s_4s_2$. This is impossible because $t_4$ transforms $t_1t_2$ into itself 
multiplied by the product of its two commutators with $t_1$ and $t_2$. It therefore 
results that $t_1$ transforms $t_2$ into itself multiplied by one of the following 
three operators: $s_1$, $s_2$, $s_1s_2$. If we can prove that the first of these is impossible 
it will also prove that the second is impossible. Hence we assume that the 
first condition is satisfied until we arrive at a contradiction. 
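
The commutator bookkeeping in this argument rests on a few elementary identities, recorded here as a hedged aside. They assume, as in the text, that every commutator lies in the central subgroup H; the bracket notation $[x,y] = x^{-1}y^{-1}xy$ is introduced here for convenience and is not used in the original.

```latex
\begin{align*}
y^{-1}xy &= x\,[x,y], \\
[x_1x_2,\,y] &= [x_1,y]\,[x_2,y], \\
(xy)^2 &= x^2\,y^2\,[y,x].
\end{align*}
```

The second identity is the statement that an operator transforms a product into itself multiplied by the product of the two commutators. The third explains the choice made in the fourth case: if the squares of the first two operators of order 4 are $s_1$ and $s_2$ and their commutator is $s_1s_2s_3$, then the square of their product is $s_1 s_2 \cdot s_1s_2s_3 = s_3$, since every operator of H other than the identity is of order 2.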

The group of order 64 generated by H, ¢,, t. then involves only operators 
of order 4 in addition to H. The operator t; gives rise to the following 
commutators with respect to ¢,, tz, t,t, respectively: 1, $3, S25 S35 8283; 
82, 83, 8283. Hence there is only one such subgroup of order 128 possible. 
In this ¢, gives rise to the following commutators 83, 82, 828, with respect to 
t,t, t,t. respectively. Hence ¢, gives rise with respect to ¢,, t2, tite, ts, tits, 
tots, t,t2t; respectively to the following commutators: 81, 84, 81843 S82, S4, $284} 
82, 84, S284; 83, 84, 83845 81, S4y $1845 S83, Say 83843 1, 8184, S084, As these 
are obviously inconsistent such a subgroup of order 128 cannot appear in G. 
The existence of such a @ therefore implies that ¢, gives rise to the com- 
mutators s,s, with respect to ¢, and hence it gives rise to the commutators 
S283, S838, With respect to ts and ¢, respectively. As this is impossible, since 




it would give rise to too many squares, it has been proved that when a group 
involves five and only five operators which are squares and four of them are 
of order 2 then these four operators generate an invariant subgroup of order 8. 

If an operator of order 2 in G would not be commutative with every 
operator of this invariant subgroup H then it would be non-commutative with 
one of its operators $s_1$ which is a square and it and $s_1$ would generate an octic 
group involving three operators which are squares. This operator would 
therefore be commutative with exactly half of the operators of H since it is 
commutative with the remaining square of order 2. Hence at least one of the 
squares of G would be invariant under G since the co-sets with respect to 
H are invariant and therefore two of these squares would have this property. 
An operator whose square is a non-invariant operator of G would therefore 
be commutative with every operator of H and hence all the operators of the 
co-set with respect to H to which it belongs have the same square. This 
square is therefore invariant under G, which is contrary to the hypothesis. 
That is, we arrived at a contradiction by assuming that an operator of order 2 
in G is not commutative with every operator of H. If an operator of order 4 
in G were non-commutative with a square it would transform exactly two 
squares among themselves. Since no operator could be non-commutative with 
all of the four squares these squares could not be transformed under G 
according to a group of degree 4. It has been noted that only two squares 
could not be non-invariant. Hence, it results that H is in the central of G. 

It is easy to see that H is the central of G since this central cannot 
contain an operator of order 4 and if it would contain an operator of order 2 
which does not appear in H then G would be a direct product. Each of the 
possible groups appears in one and only one of the following three categories: 
The first is composed of those in which the product of every two distinct 
squares of order 2 is a non-square, the second of those in which at least two 
operators of order 4 which have different squares are commutative, the third 
of those in which the product of two squares is a square but no two operators 
of order 4 which have different squares are commutative. In the first case 
it may be assumed that the four squares of order 2 are $s_1$, $s_2$, $s_3$, $s_1s_2s_3$. In 
each of the other two cases it may be assumed that the squares of order 2 are 
$s_1$, $s_2$, $s_1s_2$, $s_3$. The smallest order of a group which satisfies the conditions 
under consideration is 64. We proceed to determine all the groups of this 
order which belong to these three categories in the given order. 

We shall first prove that each of these groups contains a definite sub- 
group of order 32. To prove this we first extend H by an operator $t_1$ of 
order 4 whose square is $s_1$ and thus obtain an abelian subgroup of type (2,1,1). 




This subgroup is then extended by $t_2$ whose square is $s_2$ so as to obtain a 
subgroup of order 32. We proceed to prove that this can always be so selected 
that $t_1t_2$ is of order 2 and hence it is completely determined. If $t_1t_2$ were of 
order 4 its square may be assumed to be different from s;. Hence ¢; whose 
square is s, would have to transform this subgroup of order 32 so as to give 
rise to four commutators including at least one of the form s,s,. An operator 
of order 4 with respect to which ¢, gives rise to this commutator and ft; have 
a product of order 2 since the square of such a product could not be of the 
form s,82. This proves the following theorem: If a group involves five and 
only five operators which are squares including four of order 2 and if the 
product of no two distinct ones of these operators of order 2 is a square then 
the group contains a subgroup of order 32 generated by these squares and two 
operators of order 4 having distinct squares and a product of order 2. 

It may now be assumed that all the groups of order 64 belonging to the 
first of the three categories under consideration contain this subgroup of order 
32 generated by H, $t_1$, $t_2$ where $t_1t_2$ is of order 2. There is obviously one and 
only one such group in which the product of every two operators of order 4 
which have different squares is of order 2. There is also one and only one such 
group in which ¢,¢; is of order 2 but ¢2t; is not of this order. To prove that 
in each of the remaining groups of this category each of the additional opera- 
tors is of order 4 it is only necessary to note that t,t.t; could not be of order 2 
in some one of them. This results from the fact that we may assume that 
tot, is not of order 2 since we would otherwise get a group which is conjugate 
with the one already considered, and hence no commutator of the form 5,52 
could arise from ¢;. Each of these remaining groups therefore contains three 
pairs of operators of order 4 whose products are of order 2 and which are 
distinct modulo H and have distinct squares. In particular, each of these 
remaining groups involves three conjugate subgroups of order 32 which have 
the abelian subgroup of type (1,1,1,1) in common. 

Since an operator whose square is $s_1s_2s_3$ appears among those which are 
added to the given group of order 32 it may be assumed without loss of 
generality that ¢, transforms ¢, into itself multiplied by s, and that it trans- 
forms ?,¢, into itself multiplied by one of the following four operators: 
1, $s_1s_2$, $s_1s_3$, $s_2s_3$. Hence there are two additional such groups of order 64. 
In one of these the commutator subgroup is of order 4 while in the other 
it is of order 8. It therefore results that there are four and only four groups 
of order 64 which separately satisfy the condition that they contain five and 
only five operators which are squares, including four of order 2, and that the 
product of no two squares is a square. 




The second category of groups under consideration contains by hypothesis 
the abelian group of type (2,2,1). To extend this so as to obtain a G of 
order 64 it is necessary to add thereto an operator $t_3$ of order 4 whose square $s_3$ 
is the fourth square of order 2 in G. Since $t_3$ is not commutative with any 
operator of order 4 contained in this subgroup it must give rise to four 
distinct commutators with respect thereto and hence it transforms into its 
inverse at least one of these operators of order 4. Since H contains three 
subgroups of order 4 which include the square of the operator of order 4 
which is transformed into its inverse by $t_3$ and one of these subgroups 
corresponds to two possible G’s there results the following theorem: Four and 
only four groups of order 64 have the property that each of them contains 
five operators which are squares thereunder and contains the abelian group of 
type (2, 2). 

It remains to determine the groups of order 64 which separately satisfy 
the condition that no two of their operators of order 4 which have different 
squares are commutative but that the product of the squares of two such 
operators is one of the four squares of order 2 contained therein. We may 
assume that ¢, and f. are non-commutative and that ¢,2 s,s... We shall first 
consider the case when ¢,¢, is of order 2 and extend the subgroup of order 32 
generated by H, t,, t2 by t, so as to obtain one of the groups of order 64 which 
satisfies the given condition. It is easy to verify that ¢, cannot transform 
one of the three operators t,, ts, t,t. into itself multiplied by s3. It can also 
not transform /,/. into itself multiplied by one of the following operators: 
8182, 883, 1, S,, 8. since its product with one of the three operators ¢,, ts, tite 
has s, for its square. If t,t, is transformed into itself multiplied by s,s. the 
group is completely determined. The commutator subgroup of this group 
is of order 4. 

When ¢,¢, is transformed into itself multiplied by s1s.s; the commutator 
subgroup is of order 8. There is one and only one such group and hence there 
are two possible groups of order 64 in which $t_1t_2$ is of order 2. When $t_1t_2$ is 
of order 4 the square of their product may be one of the two operators $s_1$, $s_2$ 
or it may be $s_3$. In the former case we may assume without loss of generality 
that the square of $t_1t_2$ is $s_2$. We do not need to consider the case when $t_3$ 
transforms ¢,¢, into itself multiplied by s, since G would then contain two 
' operators of order 4 having different squares whose product would be of order 
2, As before ¢, could not transform one of the operators f,, to, t,t, into itself 
multiplied by s;. Moreover, ¢, could not transform t,t, into itself multiplied 
by one of the following operators: 1, S2, 8,883, $283. If it transforms it into 
itself multiplied by s,s. then @ is completely determined and involves only 




operators of order 4 in addition to H. Hence there are seven and only seven 
groups of order 64 which separately satisfy the conditions that each contains 
five and only five operators which are squares, four being of order 2, and that 
no two operators of order 4 which have different squares are commutative but 
the product of two squares is a square. 

There is no upper limit for the orders of the remaining possible groups 
since every such group can be extended so as to obtain a group whose order is 
four times the order of the given group provided this group contains a sub- 
group of index 2 such that each of the remaining operators is of order 4. 
It is easy to verify that this condition is satisfied by groups in each of the 
three given categories and that when it is satisfied we may use the direct 
product of this group and a group of order 2 and adjoin to it an operator 
of order 4 which is commutative with an operator of order 4 not contained 
in the given subgroup of index 2, has the same square as the latter operator, 
and transforms this subgroup in the same manner as the given operator of 
order 4 transforms it. The resulting group can then be used in the same way 
to construct such a group of four times its own order and this process can 
be repeated indefinitely. Each of the infinite systems of groups thus obtained 
contains exactly five operators which are squares thereunder and four of these 
operators are of order 2. 




CORRECTION AND ADDITION TO “COMPLEMENTS OF 
POTENTIAL THEORY.” 


By GRIFFITH C. EVANS. 


Dr. F. G. Dressel has called my attention by means of an example to the 
necessity of a correction for Lemma II, p. 217, in the above mentioned memoir. 
In fact, the theorem of Daniell, quoted in the lemma, does not apply. The 
lemma should read as follows: 


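Lemma II. Let $f(x)$ be bounded and measurable in the Borel sense, $g_1(x)$, $g_2(x)$ 
of bounded variation and $g_1(x)$ or $g_2(x)$ continuous, $a \le x \le b$; 
then $g(x) = g_1(x)\,g_2(x)$ is of bounded variation, and 


(1.1) $\int_a^b f(x)\,dg(x) = \int_a^b f(x)\,g_1(x)\,dg_2(x) + \int_a^b f(x)\,g_2(x)\,dg_1(x).$ 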


In the proof of the lemma omit (c), line 7, p. 218, replacing it by (d), 
and replace line 22, p. 218, by the inequality 


$t_{1m}(b) - t_{1m}(a) \le t_1(b) - t_1(a),$ 


for the total variation functions. Let $g_2(x)$, say, be continuous. There is 
then no need of $g_{2m}(x)$, and the proof, in the case of $f(x)$ continuous, ends 
with line 3, p. 219. The extension to f(x), bounded and measurable in the 
Borel sense, is the same as before. 

We note also the following: 


Lemma II’. Let $g_1(x)$, $g_2(x)$ be of bounded variation, 


$t_1$ the total variation of $g_1$, and $N_2$ the upper bound of $|g_2(x)|$ over $a \le x \le b$. 
Then 


$\left|\int_a^b g_1(x)\,dg_2(x)\right| \le |g(b) - g(a)| + N_2\,\{t_1(b) - t_1(a)\}.$ 


¹American Journal of Mathematics, vol. 54 (1932), pp. 213-234. The example 
given by Dr. Dressel is the following: 
f(a) =1, 
=I, le@w=2, 
for which the left-hand member of (1.1) has the value 1 and the right-hand member 
the value 0. 




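In fact, let $g_{1n}(x)$ be the continuous polygonal approximation to $g_1(x)$, 
of Lemma II. Then, $g_{1n}(b)g_2(b) - g_{1n}(a)g_2(a) = g(b) - g(a)$, and 


$\left|\int_a^b g_{1n}(x)\,dg_2(x)\right| = \left|g(b) - g(a) - \int_a^b g_2(x)\,dg_{1n}(x)\right|$ 


$\le |g(b) - g(a)| + N_2\,\{t_{1n}(b) - t_{1n}(a)\}$ 


$\le |g(b) - g(a)| + N_2\,\{t_1(b) - t_1(a)\}.$ 
But 


$\lim_{n=\infty} \int_a^b g_{1n}(x)\,dg_2(x) = \int_a^b g_1(x)\,dg_2(x),$ 


so that the inequality is established. 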

In the proof of Lemma IV, p. 222, Lemma II’ should be cited for the 
inequality of line 11, p. 224, without making use of the previous equation. 

A general theorem of the type of Lemma II is the following one. 


THEOREM. Let $f(x)$ be bounded and measurable Borel, and $g_1(x)$, $g_2(x)$ 
of bounded variation, $a \le x \le b$; and let $e_1$, $e_2$ be the sets respectively of 
values of x corresponding to the points of discontinuity of $g_1(x)$, $g_2(x)$. 
Then (1.1) is valid provided $e_1$ and $e_2$ have no points in common. 
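
A small worked example suggests why the hypothesis on the two sets of discontinuities cannot be dropped. It is offered as an illustration only and is not necessarily the example of Dr. Dressel, whose footnote is only partly legible in this copy. It assumes that (1.1) is the product rule $\int f\,dg = \int f g_1\,dg_2 + \int f g_2\,dg_1$ with $g = g_1 g_2$, and it takes a point $c$ with $a < c < b$; the function $h$ is a name introduced here.

```latex
\begin{align*}
f(x) &\equiv 1, \qquad
g_1(x) = g_2(x) = h(x) =
  \begin{cases} 0, & x \le c, \\ 1, & x > c, \end{cases}
\qquad g(x) = g_1(x)\,g_2(x) = h(x), \\[4pt]
\int_a^b f\,dg &= h(b) - h(a) = 1, \\[4pt]
\int_a^b f\,g_1\,dg_2 + \int_a^b f\,g_2\,dg_1
  &= 2\,h(c)\,\{h(c+0) - h(c-0)\} = 0 .
\end{align*}
```

Both functions jump at the same point $c$, each integrand on the right vanishes at the only point that carries mass, and the two members take the values 1 and 0 respectively.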


There is evidently no loss in generality in assuming $g_1(x)$, $g_2(x)$ to be 
not negative and monotone-increasing. The function $g(x)$ will then be of 
the same sort. Let $f(x)$ be continuous, and write 


$g_i(x) = \alpha_i(x) + \beta_i(x)$  $(i = 1, 2),$ 


where $\alpha_i(x)$ is continuous and $\beta_i(x)$ is the corresponding “function of 
discontinuities.” We have 


and by Lemma II, the identity (1.1) may be applied to every integral except 
the last. By proving that it applies to the last integral, the identity will be 
established for f(x) continuous, and may then be extended to f(x) bounded 
and measurable Borel, as before. 
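
The display that followed "We have" did not survive in this copy. A hedged reconstruction, assuming only the decomposition of each function into its continuous part and its function of discontinuities, and the linearity of the integral in its integrator, would read as follows; the symbols $\alpha_i$, $\beta_i$ for the two parts are taken as assumptions.

```latex
\begin{align*}
\int_a^b f\,d(g_1 g_2)
  &= \int_a^b f\,d(\alpha_1\alpha_2) + \int_a^b f\,d(\alpha_1\beta_2)
   + \int_a^b f\,d(\beta_1\alpha_2) + \int_a^b f\,d(\beta_1\beta_2), \\[4pt]
\int_a^b f\,d(\beta_1\beta_2)
  &= \int_a^b f\,\beta_1\,d\beta_2 + \int_a^b f\,\beta_2\,d\beta_1 .
\end{align*}
```

In each of the first three integrals of the first line one factor of the integrator is continuous, so that Lemma II applies. The second line is presumably the statement that remains to be proved for the last integral.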

Accordingly it remains to prove that 


assuming $f(x)$ to be continuous. 




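Consider first the case where $\beta_1$ and $\beta_2$ are step functions with merely 
a finite number of jumps $B_i$, $C_j$ at values $x = b_i$, $x = c_j$ respectively, with 
$b_i \ne c_j$ for all i, j. Then, evidently, 


$\int_a^b f\,d(\beta_1\beta_2) = \sum_i f(b_i)\,\beta_2(b_i)\,B_i + \sum_j f(c_j)\,\beta_1(c_j)\,C_j$ 
$= \int_a^b f(x)\,\beta_2(x)\,d\beta_1(x) + \int_a^b f(x)\,\beta_1(x)\,d\beta_2(x),$ 


where the integrals of the right-hand member are general (Daniell) integrals. 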

Let now $\beta_1(x)$ and $\beta_2(x)$ be arbitrary functions of discontinuities, so 
that they may have a denumerable infinity of finite jumps. We define $\beta_{1n}(x)$ 
as a step function, approximating to $\beta_1(x)$, with merely a finite number of 
jumps. In fact, let $b_1, b_2, \dots, b_k$ be the values of x at which the jump 
of $\beta_1(x)$ is $\ge 1/n$. It may happen that a or b is a $b_i$. The function $\beta_{1n}(x)$ 
is to have discontinuities only at $b_1, \dots, b_k$, and at these points it is to have 
the same discontinuities as $\beta_1(x)$, viz., 


$\beta_{1n}(a) = \beta_1(a),$ 
$\beta_{1n}(b_i) - \beta_{1n}(b_i \pm 0) = \beta_1(b_i) - \beta_1(b_i \pm 0).$ 


It is clear that 
(I) $\lim_{n=\infty} \beta_{1n}(x) = \beta_1(x),$  $a \le x \le b.$ 


For on the one hand, $\beta_{1n}(x) \le \beta_1(x)$. And on the other hand, given $\epsilon > 0$, 
we can find n so that the sum of all the finite jumps which are each in value 
$< 1/n$, will be $< \epsilon$. Consequently for n sufficiently large, $\beta_{1n}(x) > \beta_1(x) - \epsilon$, 
$a \le x \le b$. Moreover, since $\beta_{1n}(x_2) - \beta_{1n}(x_1)$ is merely the sum of 
discontinuities of $\beta_1(x)$ belonging to those finite jumps of $\beta_1(x)$, in the interval 
$x_1 \le x \le x_2$, each of which is in value $\ge 1/n$, we have 


(II) $\beta_{1n}(x_2) - \beta_{1n}(x_1) \le \beta_1(x_2) - \beta_1(x_1).$ 


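We define similarly functions $\beta_{2m}(x)$ approximating to $\beta_2(x)$, and similar 
properties (I), (II) hold for the functions $\beta_{mn}(x) = \beta_{1n}(x)\cdot\beta_{2m}(x)$. In fact, 


$\beta_{1n}(x_2)\beta_{2m}(x_2) - \beta_{1n}(x_1)\beta_{2m}(x_1)$ 
$= \{\beta_{1n}(x_2) - \beta_{1n}(x_1)\}\,\beta_{2m}(x_2) + \beta_{1n}(x_1)\,\{\beta_{2m}(x_2) - \beta_{2m}(x_1)\}$ 
$\le \{\beta_1(x_2) - \beta_1(x_1)\}\,\beta_2(x_2) + \beta_1(x_1)\,\{\beta_2(x_2) - \beta_2(x_1)\}$ 
$= \beta_1(x_2)\beta_2(x_2) - \beta_1(x_1)\beta_2(x_1).$ 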




The hypotheses (I), (II) are the hypotheses of Daniell’s theorem,? whence 


But also 
lim f° — f° f(x) 


as an elementary property of the general integral. Since $e_1$, $e_2$ have no common 
elements, there are no common points of discontinuity of $\beta_{1n}$, $\beta_{2m}$, and 


By successive passages to the limit, the identity is established. 


UNIVERSITY OF CALIFORNIA, 
BERKELEY, CALIFORNIA. 


*P. J. Daniell, “Further properties of the general integral,” Annals of 
Mathematics, vol. 21 (1920), pp. 203-220. See p. 218. 




ON A CERTAIN CLASS OF ORTHOGONAL POLYNOMIALS. 


By A. TARTLER. 


Introduction. Let $\psi(x)$ denote a bounded non-decreasing function, a 
“characteristic function,” with infinitely many points of increase on the 
finite or infinite interval (a,b) and such that 


all “moments” $\alpha_i = \int_a^b x^i\,d\psi(x)$ exist, with $\alpha_0 > 0$. 


The object of this paper is to study the system of orthogonal and normal 
polynomials 


(1) Un (a; dp) == Un == Ay (a" — +) = n(2; dp) = n(2) 


corresponding to the more general characteristic function of bounded variation 


(1 = 1, 2,° 


and their relation to the system of polynomials 


(2) = dy) = Ant" + (dn >0; n=0,1,---) 


having the fundamental property 
th 
(3) — Bn (m,n—=0,1,- °°). 


(3) is equivalent to 


where $G_s(x) = \sum_{i=0}^{s} g_i x^i$ here and hereafter stands for an arbitrary polynomial 


*We assume the non-existence of numbers c, d such that 


$\int_a^c d\psi(x) = \int_d^b d\psi(x) = 0$  $(a < c,\ d < b).$ 


e 

(n=0,1,- > 0) 



of degree $\le s$. Our main purpose is to show how far the known properties 
of the system (2) are extendable to (1). The results obtained are an extension 
of those announced, without proof, by J. Shohat.? 


1. Some needed properties of orthogonal polynomials. 


(i) 


(L) = Cn) (Z) —AnPn-2(L) 5 An =O 


ag? > 
(5) (2) dy (2) (n=2); 
4=1 


(ii) The polynomials $\Phi_n(x)$ are the denominators of the successive 
convergents of the “associated” continued fraction * 


(6) ay(y) 


(iii) If $x_{1,n}, x_{2,n}, \dots, x_{n,n}$ denote the zeros of $\Phi_n(x)$, then 


(7) $a < x_{1,n+1} < x_{1,n} < x_{2,n+1} < \cdots < x_{n,n+1} < x_{n,n} < x_{n+1,n+1} < b.$ 


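(iv) Darboux’ formulae,* which are of fundamental importance in the 
discussion which follows: 


(8) $K_n(x, t; d\psi) \equiv K_n(x, t) = \sum_{i=0}^{n} \phi_i(x)\,\phi_i(t) = \frac{a_n}{a_{n+1}}\,\frac{\phi_{n+1}(x)\,\phi_n(t) - \phi_{n+1}(t)\,\phi_n(x)}{x - t},$ 


$K_n(x, x; d\psi) \equiv K_n(x) = \sum_{i=0}^{n} \phi_i^2(x) = \frac{a_n}{a_{n+1}}\,\{\phi'_{n+1}(x)\,\phi_n(x) - \phi'_n(x)\,\phi_{n+1}(x)\}.$ 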


(v) Let {nim} {fin} denote respectively the zeros 
of dys), dpe), with 


(9) < < < < Lon < < Lon < 


* Jacques Chokhate (J. Shohat), “ Sur les fractions continues algébriques,” Comptes 
Rendus, vol. 191 (1930), p. 474. 

*Q. Perron, Die Lehre von den Kettenbriichen, Teubner, 1913, p. 377. 

* Darboux, “ Mémoire sur l’approximation des fonctions de trés grands nombres,” 
Journal de Mathématiques (3), vol. 4 (1878), pp. 5-56, 377-416. 




2. Existence of the system of orthogonal polynomials Un(x; dp). The 
fundamental problem is to derive conditions assuring the existence of a 
sequence of polynomials Un(x) defined in (1), satisfying either one of the 
equivalent conditions of orthogonality : 


(10) (n= 1, 2,° 
a) dy(z) = 0 (nm; 


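Using (3, 10), we get, writing $(x - \alpha)\,U_n(x) = \sum_{i=0}^{n+1} A_i\,\phi_i(x)$: 


(11) $(x - \alpha)\,U_n(x) = \frac{\phi_{n+1}(x)}{a_{n+1}} + A_n\,\phi_n(x).$ 


If $\phi_n(\alpha) \ne 0$, $A_n$ is uniquely determined. If $\phi_n(\alpha) = 0$, then necessarily 
$\phi_{n+1}(\alpha) \ne 0$ (see (7)) and $A_n$ does not exist. This, combined with Darboux’ 
formula (8), leads to 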


THEOREM I. A necessary and sufficient condition that $U_n(x)$ satisfying 
(10) exist for a given n, is: $\phi_n(\alpha) \ne 0$. $U_n(x)$ is then uniquely determined: 


(12) $U_n(x) = \frac{K_n(x, \alpha)}{a_n\,\phi_n(\alpha)}.$ 


If $U_n(x)$ exist, then $U_n(\alpha) \ne 0$. 
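
The orthogonality asserted by Theorem I can be checked exactly in a simple special case. The sketch below is an illustration under stated assumptions, not the author's method: it takes the characteristic function to be $\psi(x) = x$ on $(-1, 1)$, so that the orthonormal polynomials are normalized Legendre polynomials, and it assumes that the orthogonality condition (10) refers to the weight $(x - \alpha)\,d\psi(x)$. The function names are hypothetical.

```python
from fractions import Fraction

def legendre_polys(n):
    """Coefficient lists (ascending powers) of the Legendre polynomials P_0..P_n."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(1, n):
        # Bonnet recurrence: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        x_pk = [Fraction(0)] + polys[k]
        prev = polys[k - 1] + [Fraction(0)] * 2
        polys.append([((2 * k + 1) * u - k * v) / (k + 1) for u, v in zip(x_pk, prev)])
    return polys[: n + 1]

def evaluate(coeffs, x):
    """Value at x of the polynomial with the given ascending coefficients."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def kernel_coeffs(n, alpha):
    """Coefficients in x of K_n(x, alpha) = sum_i phi_i(x) phi_i(alpha),
    where phi_i = sqrt((2i+1)/2) P_i, so phi_i(x) phi_i(alpha) is rational."""
    out = [Fraction(0)] * (n + 1)
    for i, p in enumerate(legendre_polys(n)):
        weight = Fraction(2 * i + 1, 2) * evaluate(p, alpha)
        for d, c in enumerate(p):
            out[d] += weight * c
    return out

def integrate(coeffs):
    """Exact integral over [-1, 1]; odd powers contribute nothing."""
    return sum(c * Fraction(2, d + 1) for d, c in enumerate(coeffs) if d % 2 == 0)

def shifted_moments(n, alpha):
    """Integrals of (x - alpha) * x^j * K_n(x, alpha) over [-1, 1], j = 0..n-1."""
    shifted = [Fraction(0)] * (n + 2)
    for d, c in enumerate(kernel_coeffs(n, alpha)):
        shifted[d + 1] += c          # contribution of x * K_n
        shifted[d] -= alpha * c      # contribution of -alpha * K_n
    return [integrate([Fraction(0)] * j + shifted) for j in range(n)]

if __name__ == "__main__":
    print(shifted_moments(4, Fraction(3, 10)))
```

All the moments vanish, whether the chosen point lies inside or outside the interval, which is the orthogonality of the kernel to every polynomial of lower degree with respect to the shifted weight. When the point is a zero of the Legendre polynomial of degree n (for instance 0 with n = 3), the leading coefficient returned by `kernel_coeffs` is zero, in line with the exceptional case of the theorem.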


If dnii(%) = 0, then by (11) An = 0 and 
(18), (18°) a(x) (2) 0; 


t.¢., the degree of the arbitrary polynomial Gn(x) being here as high as 
n— the degree of Un(x). Conversely, by (12, 3), if a polynomial 
=a" satisfies (13’), then necessarily —0. As an 
immediate consequence of Theorem I we state the important 


THEOREM II. $\phi_n(\alpha) \ne 0$ for $n \ge 1$⁵ implies the existence of a set of 
orthogonal polynomials $U_n(x) = x^n + \cdots$ of all degrees $(n = 0, 1, \dots)$ 
satisfying (10) and uniquely determined by means of (12). 


Hereafter we assume $\phi_n(\alpha) \ne 0$ $(n \ge 1)$ unless explicitly stated otherwise. 


⁵Infinitely many such $\alpha$ exist in any subinterval of (a, b). 




3. Normalization of the system Un(x). By virtue of (8, 3) 


Kola, a) Gn(x) (%) Jn 


On+1 


Take here G,(z) =K,(z,%) and use (12): 


(14) —— (pp +15 > 0), 


dnsi(a) | 1 
Gn? || pn (a) | 


. (%) 
15 wre t) = — sgn | — 
(15) ta? (a) — pa — sgn | — 
Thus the integral in (15) is positive or negative, contrary to what we shall 
call the “ordinary” case, i.e., that of a monotonic characteristic function. 
Turning to (7), we get at once 


for @< OF Ten < 


(16) 


4, The recurrence relation for Un(x). Write 
(17) Un(x) = (@— En) + Pa-2(2) (En = const.), 


where $P_{n-2}(x)$ is a polynomial of degree $\le n-2$. Making use of (10), 
we get at once: 


b 


The degree of $P_{n-2}(x)$ cannot be less than $n-3$. For otherwise, we could 
take in (18) $G_{n-3}(x) = (x - \alpha)\,P_{n-2}(x)$ and thus render the integrand 
non-negative. $P_{n-2}(x)$ cannot be of degree $n-3$ for then (18) would be 
equivalent to (13’), which in turn implies $\phi_{n-2}(\alpha) = 0$, contrary to our 
assumption (§2). Hence, $P_{n-2}(x)$ is actually of degree $n-2$. Moreover, (18) 
being nothing but the condition of orthogonality (10), $P_{n-2}(x)$ differs from 
$U_{n-2}(x)$ by a constant factor only, so that (17) becomes 


(19)  Un(x) = —AnUn-2(z) En, An = const.). 


We thus obtain for $\{U_n(x)\}$ a recurrence relation precisely of the same type 
as (5). (19) yields through (10, 14) by comparing coefficients: 




(<r) 


a 


an-1 


(n=2), 


== Pn-1Pn-2 


Cis Cn = Sn — Sn-1- 
i=1 

It follows that in the case under consideration An are not all positive, contrary 
to the ordinary case. 

Introduce, as in the ordinary case (Perron, I. c.), the “ associated ” power 
series and continued fraction 

21 ~ ~ — 
(21) a t—Y /42(2) 
The n-th convergent of the latter we denote by Pn(z)/Qn(x), Qn(z) being of 
degree yn (n= 0). Then, its fundamental property 


leads to the orthogonality property for Qn(z) : 


(qi(x)-polynomials). 


Hence, we may identify Qn(x) with Uy,(x), or with Uy,-1(x) according as 
¢u,(%) is, or is not, zero. If ¢n(«%) 0 for n=1, the degrees of the de- 
nominators of the successive convergents in (21) differ by one and all the 
Qn(z) are of the first degree, as in (6) above: 


(23) Az ++ from (20)). 


We take X, = By, for, by (22), 


-(3) 

5. The zeros of Un(x) compared with those of dni(z). Denote the 
zeros Of by tin (i= 1,2,---,n;n2Z1). By (12): 


*(1/es) generally stands for c,/x + ¢,/asti+... (c, 40). 




(%) $n hn < 0. 


Considering the sign of the product of the first two factors in the left-hand 
member of (24), we readily arrive at 


THEOREM III. The interval $(x_{i,n+1}, x_{i+1,n+1})$ contains either no zeros, or 
one zero, of $U_n(x)$, according as $\alpha$ is, or is not, an interior point of it. 


Remark. (13) shows that if $\alpha$ is one of the zeros of $\phi_{n+1}(x)$, its remaining 
n zeros are precisely those of $U_n(x)$. This case was excluded and is 
mentioned here merely as a limiting case when $\alpha$ tends to a zero of $\phi_{n+1}(x)$ 
(Cf. § 6). 


Corotuary. If @ < OF > the zeros of Un(x) separate those 
Of dns (2). 
We proceed to investigate more closely the case when «@ is an interior 


point of one of the intervals (%ins1, Since 
Line < Lin < Vks1,n+1, it is convenient to consider two cases: 


In (12) put = 


Here > 0, — co <0 (see (7)), > 0; hence 


On(@ns1nn) <0 and Un(x) has one and only one zero in the interval 
+ co), 


(ii) Zin < We find in a similar manner that has 
one and only one zero tn the interval (— ©, Lins). 


We proceed further to specify the values of « for which the zeros of U,(2), 
for a given n, include either a or 6 (assumed to be finite). To this end 
consider (xz) = ©®,(2; dy.) for which 


“lg { (b — x) On(z) \ Sk<n). 


Hence, by virtue of Theorem I (uniqueness) : 


U,(2; (2 — dw) (6 — 2) (2; 


k,n 




Similarly we treat the point za by means of ®,(7; dy). In other words, 
if « is a zero of the polynomial ®n(2z; dy;,.), one of the zeros of Un(zx) 
coincides correspondingly with a or 6. This conclusion fully harmonizes with 
the inequalities and the results of §§ 2, 3. 


6. The zeros {Zin} of Un(x) as functions of a. 
THEOREM IV. The zeros {Zin} of $U_n(x)$ increase with $\alpha$. 


Proof. Differentiate with respect to a the. relation Kn(Zin,«) =0 
(see (12)): 


OKn (05 a) 
0a 
n (Zin, 


Develop the right-hand member in (25), making use of (12): 


da (Zin, &) 


(9 (2, ¥) = (2) — (y) ). 


The desired result, namely 


will be established if we succeed in showing that 
(4, Fin) 9 (Zin, a) > 0. 
But this latter inequality follows from the readily verifiable identity 


9(2,y)9(y, =9(z)g(y) 
+ (Y) — (2) (2) hn (y) — (y) J, 


& 
(9(2) = g(a, 2) 
which leads to 


2 n n 


Remark. The above general theorem holds for any real $\alpha$, inside or 
outside (a,b). 

The results of § 5, together with Theorem IV, are sufficient to describe 
completely the behavior of {Zin} as $\alpha$ increases from $-\infty$ to $+\infty$. 
This description is summarized in the following table. It will be recalled 


d 
(25) 
(t= 
se | | 
or 
dZin 
— 


(§ 2) that when $\alpha$ is a zero of $\phi_n(x)$, $U_n(x)$ does not exist, but $U_{n-1}(x)$ 
necessarily exists; at this point it is convenient to regard it as the polynomial 
$U_n(x)$, with one of its zeros infinite. 


Lin 
< L141 < Lin < (4 nN) 
Lin < Viner < < (4 = +1; Tn+2,0+1 = b). 
a= Ein Lim = (i=—1,2,---,n—1), Enn=b 
Lin Inn =+ ©, OF © 
Lin << %< Fin Viner < Lin < (1 = 2, 3,°°° 
= Lin =, Lin = Nin (i= 2, 3,°°*, 
™Mmn % < < Fim Viner < Lin < (4 2, 3, 
Zin Lin =Ni,n (t= 2, 3,°°°,%) 
@ varies from Zin varies as from * to **, with 
proper changes of indices. 
< & Viner < Lin (4 =1,2,°°:, n). 


7. On the separation of the zeros of $U_n(x)$ and $U_{n+1}(x)$. Here we use
$K_n(x, \alpha)$ instead of $U_n(x)$. Consider first the case when $a < \alpha < x_{1,n+1}$. Here
(§ 5) the zeros $Z_{i,n}$ of $K_n(x, \alpha)$ separate the zeros of $\varphi_{n+1}(x)$:


(26) 1,008 < Lin < Vener << Ton < Inn < < b. 


We note that if (26) holds for $n = n_0$, it does so for $n < n_0$, for the hypothesis
$\alpha < x_{1,n+1}$ implies $\alpha < x_{1,m}$, $m < n + 1$, by virtue of (7). Furthermore,


(27) (Zin; (Ziss,n; — ( a) Pnsi (Zin) (Zis1,n) ? 
(28) Kau (24,n41) %) a) = K,, @) Kin a). 


The right-hand member of (27) being negative by virtue of (26), it follows
that $K_{n+1}(x, \alpha)$ changes sign an odd number of times in each of the intervals
$(Z_{i,n}, Z_{i+1,n})$. Moreover, the right-hand member of
(28) being also negative, we conclude that it changes sign in each of these
intervals only once. Hence,


The same inequalities hold if $x_{n+1,n+1} < \alpha$.




Assume now that @ separates two of the zeros of dnii(%); say, 
Tens Here Kn(x,@) has no zero in and 
has one zero in each of the remaining intervals (%ijns1, Vist,nur) (UK). If 
in (27) 14k —1, its right-hand member is negative, and Kn,,(7, «) changes 
sign in the corresponding intervals (Zin, Zis1n) (tk) at least once. We 
now assert that Kn,,(z,@) changes sign twice in the interval (@-1,n, Tx,n) ; 
more precisely, it changes sign in each of the subintervals (Z-1,n, m+); 
Zen). In fact, 


Kuss a) = (Lk-1,n) hnsi(%), 
an $7 (%) (Lk-1,n) bn 


On+1 Lk,n+1 


Furthermore, since sgn = (— 1)", sgn dn (Liner) = (— 1) 


Raw (Zx-1,n; a) < 0, 
and similarly, 
(Zen; 0) < 0. 


The last two inequalities prove our assertion. Thus we state 


THEOREM V. (i) < OF > Umplies: the zeros of Un(zx) 
separate those of Unss(%)3 Tene < Umplies: each of the 
intervals (in, contains one zero and the interval 
Ten) contains two zeros of 


(The one remaining zero of $U_{n+1}(x)$ is either $< Z_{1,n}$ or $> Z_{n,n}$.)
It is known for the ordinary case that the zeros of $\Phi_n(x)$ for n very
large are everywhere dense in (a,b), provided


(29) $\int_{a_1}^{b_1} d\psi(x) > 0 \qquad (a \le a_1 < b_1 \le b)$.
This, combined with Theorem V, leads to the


COROLLARY. Under (29) the zeros of $U_n(x)$ for n very large are everywhere
dense in (a,b).


8. The mechanical quadratures formula related to Un(x). We consider 
the mechanical quadratures formula—a direct application of the Lagrange 
interpolation formula— 




II 
{ 


a=; 


where $\{\xi_i\}$ denote n distinct points arbitrarily chosen. If we write


Gay-s(2) = (2 —&) + (2) 


and make use of the orthogonality properties (10), we arrive at 


THEOREM VI. The mechanical quadratures formula (30) holds for
$G_{2n-1}(x)$ if, and only if, the points $\xi_i$ are zeros of $U_n(x)$.


We thus get a formula of Gauss’ type 


b 
(31) Genes a) — (Zs) 
a (x — Zin) U'n( Zin) 
We get further, knowing that Zj.4 «, and taking in (31) successively 
Un(2) | Un (2) 
U 
for Z; 
>0 


Here again we find an essential difference between the case under con- 
sideration and the ordinary one (where all the coefficients in the mechanical 
quadratures formula of type (31) are positive). 

We proceed to derive an interesting expression for $H_{i,n}$ in terms of $K_n(x)$.
The orthogonality property (10) rewritten as


Un(x) (a — a) \ dp (a) 0 


L— Fin 


shows the existence of a polynomial of degree n 




Un (x; dp) (x 


t— 


Un(2; (0 — dy) = 


? 


orthogonal in (a,b) with respect to the characteristic function 


Yin (¢ — Zin) dyp(t). 


We derive successively : 


Un(a; dy) Un(2; dim) (x — Zin) U'n (Ein; dy) On (Zin; din) 


Substitute in (31) and apply (12, 8, 3): 
Lin —— 

Kn (Hin; dp) 

(33) is another proof of the inequalities in (32). It also may give indication *
as to the asymptotic behavior of $H_{i,n}$ for $n \to \infty$.


(33) 


9. Extension of Darboux’ formulae. The recurrence relation (19) readily
leads to


V pn-1pnAns: | 
4. Un-1 (x) Un-2 (y) Un-2 (2) 


(©) Un(Yy) Uns (Y) Un (2) 
Gns1 


K,,(z, y) u(2)us(y) 


R, (2,2) (xz) = (2) Um (2) (2) (x) J. 


Thus Darboux’ formulae (8) hold in our case without any modification.


10. A mechanical quadratures formula with a fixed interior point. Con- 
sider the mechanical quadratures formula 


I — ti) dy(a) 


n 


a=, 
where the points $\{\xi_i\}$ $(i = 0, 1, \ldots, n)$ are distinct and $\xi_0 = \alpha$ is arbitrarily
fixed inside (a,b). We may show by the method of § 8 that (34) holds for
$G_{2n}(x)$, provided $\varphi_n(\alpha) \ne 0$ $(n = 1, 2, \ldots)$ and $\xi_{i,n} = Z_{i,n}$ $(i = 1, 2, \ldots, n)$,
the zeros of $U_n(x)$, so that (see (33))


* This could be illustrated by means of Hermite polynomials. 




H; 1 1 
1,2,°°-,M), Hon = Fay 


Assume (a,b) to be finite. Since $Z_{i,n}$ $(i = 2, 3, \ldots, n-1)$ and at
least one of the two zeros $Z_{1,n}$, $Z_{n,n}$ are always in (a,b), (35) shows that the
corresponding coefficients $H_{i,n}$ of (31) tend to zero as $n \to \infty$ in all cases for
which it is known that the coefficients $H_{i,n}$ of (34) tend to zero. (See § 11 below).

We get further, taking in (34) successively 


(x — Zin) (Zi,n a) (Zin) 
(t= 1, 2,- 


We see that here all $H_{i,n}$ are positive.
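
The statements of this section can be illustrated numerically. The sketch below is not taken from the text and rests on stated assumptions: $d\psi(x) = dx$ on $(-1, 1)$ (orthonormal Legendre $\varphi_k$), $K_n(x, \alpha) = \sum_{k=0}^{n}\varphi_k(x)\varphi_k(\alpha)$, and the reading $H_{0,n} = 1/K_n(\alpha, \alpha)$ for the coefficient attached to the fixed point. It builds the interpolatory formula on the nodes $\alpha$, $Z_{1,n}, \ldots, Z_{n,n}$ and checks exactness up to degree 2n, positivity of all coefficients, and the value of $H_{0,n}$. The names `phi`, `nodes`, `H` are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L


def phi(n, x):
    """Values of the orthonormal Legendre phi_0..phi_n; rows index the points x."""
    k = np.arange(n + 1)
    pts = np.atleast_1d(np.asarray(x, dtype=float))
    return L.legvander(pts, n) * np.sqrt((2 * k + 1) / 2.0)


n, alpha = 3, 0.2                      # phi_3(0.2) != 0, so K_n has degree n
phi_alpha = phi(n, alpha)[0]

# Legendre-series coefficients of K_n(x, alpha) and its n zeros Z_{i,n}.
coeffs = phi_alpha * np.sqrt((2 * np.arange(n + 1) + 1) / 2.0)
nodes = np.concatenate(([alpha], np.sort(L.legroots(coeffs).real)))

# Interpolatory coefficients: sum_i H_i phi_j(node_i) = integral of phi_j, j <= n.
rhs = np.zeros(n + 1)
rhs[0] = np.sqrt(2.0)                  # only phi_0 has a nonzero integral
H = np.linalg.solve(phi(n, nodes).T, rhs)

# Compare with the exact moments of dx on (-1, 1) up to degree 2n.
moments = np.array([2.0 / (m + 1) if m % 2 == 0 else 0.0 for m in range(2 * n + 1)])
quad = np.array([H @ nodes**m for m in range(2 * n + 1)])

K_aa = phi_alpha @ phi_alpha           # K_n(alpha, alpha)
print(np.allclose(quad, moments), bool(np.all(H > 0)), H[0] * K_aa)
```

The coefficients are fixed by only $n + 1$ conditions; agreement for degrees $n + 1$ through $2n$ then comes from the orthogonality of $(x - \alpha)K_n(x, \alpha)$ to every polynomial of degree at most $n - 1$.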


11. Tchebycheff inequalities related to $U_n(x)$. Denote the set of points
$\alpha$, $\{Z_{i,n}\}$ $(i = 1, 2, \ldots, n)$ by $y_{1,n+1} < y_{2,n+1} < \cdots < y_{n+1,n+1}$. Change the
numbering of $H_{i,n}$ in (34) accordingly and rewrite it as


Gon (x) d(x) (Yi,ns1)- 


Following Stieltjes * (and Markoff), construct $G_{2n}(x)$ subject to the following
conditions:


$G_{2n}(y_{i,n+1}) = 1$ $(i = 1, 2, \ldots, k)$,
$G_{2n}(y_{i,n+1}) = 0$ $(i = k+1, \ldots, n+1)$,
$G'_{2n}(y_{i,n+1}) = 0$ $(i = 1, 2, \ldots, k-1, k+1, \ldots, n+1)$.


These $2n + 1$ conditions determine $G_{2n}(x)$ uniquely. Moreover, $G'_{2n}(x)$ has
n zeros at the points $y_{i,n+1}$ $(i = 1, 2, \ldots, k-1, k+1, \ldots, n+1)$ and,
by Rolle's theorem, $k - 1$ zeros inside $(y_{i,n+1}, y_{i+1,n+1})$ $(i = 1, 2, \ldots, k-1)$,
and $n - k$ zeros inside $(y_{i,n+1}, y_{i+1,n+1})$ $(i = k+1, \ldots, n)$, with
$n + (k - 1) + n - k = 2n - 1$. It follows readily that


$G_{2n}(x) \ge 0$ for all x, $G_{2n}(x) \ge 1$ for $x \le y_{k,n+1}$,


* Stieltjes, “Quelques recherches sur les quadratures dites mécaniques,” Œuvres,
vol. 1, pp. 377-396.




b 
+ + J Gan (a) dy (2) 
a a 
if 
\ if 
(37) = f° 
if > 6. 


Similarly, using the polynomial $T_{2n}(x)$ such that


Ton (Yi,ns1) = 0 (1=—1, 
Ton (Yi,ns1) =1 
b 
(38) + Hise Aner =f dy (x) >, 0); 


and combining this with (36) : 


The inequalities (37,39) constitute an extension to our case of the important 
Tchebycheff inequalities. It follows readily that 


(40) Hes (°° 


Y2n+1 b 
H, <{ ei, =f dy (2). 
a of Yn,n+1 
Hence, if $\psi(x)$ is continuous in the finite interval (a,b) which contains no
subinterval $(a_1, b_1)$ such that $\int_{a_1}^{b_1} d\psi(x) = 0$, then (see Corollary to Theorem V)
$H_{i,n+1} \to 0$ as $n \to \infty$ $(i = 1, 2, \ldots, n+1)$. By virtue of (35) we infer
that $K_n(\alpha) \to \infty$ as $n \to \infty$ for any fixed $\alpha$. This result combined with a
theorem due to Hamburger ⁹ gives a direct and elementary proof of the important
fact that the moment problem for a finite interval is determined.


12. On the associated continued fraction. 


THEOREM VI. If r(x) is a continuous function having s changes of sign
between a and b and $\psi(x)$ is of the nature indicated, then in the associated
continued fraction

⁹ H. Hamburger, “Über eine Erweiterung des Stieltjesschen Momentenproblems,”
Mathematische Annalen, vol. 81 (1920), pp. 235-319, Theorem XVII.


the degrees of the polynomials $q_i(x)$ $(i = 1, 2, \ldots)$ cannot exceed $s + 1$.


Proof. We have formally, denoting the i-th convergent of (41) by 


n=0 L 


and expanding the left-hand member: 
+ + Hap, =O (7 =0,1,°* +, pin —2), 


which is equivalent to 


b 
Were a certain $q_i(x)$ in (41) of degree $> s + 1$, we would have

Pin mi > S +1, Hin —2 


and we could render (42) impossible by choosing Gy,,,-2(@) in (42) so that 
r(r)Gy,,.-2(%) 20 forasrSb. The results of §13 below show that the 
upper bound for the degree of $q_i(x)$ as given in Theorem VI is the best
possible.


13. The case $\varphi_n(\alpha) = 0$. Here (§ 2) $U_n(x)$ does not exist; $U_{n-1}(x)$ and
$U_{n+1}(x)$, however, necessarily exist (since $\varphi_{n \pm 1}(\alpha) \ne 0$). Moreover,
by (13′),


b 
We get, writing 


(43) = (2? + + Ener) + Pn-2(2) 


($P_{n-2}(x)$: polynomial of degree $\le n - 2$):
(44) P,-2(2) Ge-s(2) — dp(2) —0 (see (10)). 
(44) can be satisfied in the following cases only 
(i) Pno(x) =0; (ii) Pn-2(@) if dn-2(a) 
(111) if = = 0. 
(i) is impossible. In fact, it leads, through (13’, 10), to 


= b 
while on the other hand by (18), 




(ii) Here (43) becomes 


(45) Onn (x) (2* + + ) (x) n-2 (2). 
We find as above (see (20)): 


b 
2"U (x) dy (x) 


— 


(2) dy(2) 


— 


Multiplying (45) by 2"(a—a)dy(x), integrating, and making use of 
(10, 13, 2), we get further: 


On+2,n An,n-2 
On+2 


T An-1 n-1(%) — Snr + = 0. 


Furthermore, using (12), we get 


An-2n-2(%) 
1(@) 


An,n-2 On+2,n 
Cns1 — [ Sn |_— [ Sa 
An+2 


An-2n-2 (%) 
An-1Pn-1 (a) 
tn the so-called “ symmetric ” case y(— 2) ; (a,b)=(—A,h)) 


Sn-1 — Buus 


Cn+1 + (Snort — Sx-1). 


An,n-2 
Sn = Cn = 0 (n 21), 


so that an i=2 


Pn- 


— ~ Vike Ener = — — Aner (“ symmetric ” case). 
(ili) = = 0. Proceeding as before, we get: 
- 
An” 
On,n-2 On+2,n 
Cns1 [ Sn [s Inst S nae + Sn-1) ; 
n+2 


0; = Ans (“symmetric ” case). 


If =0, dnse(%) 0, we write 


b 
+ Pa(a),and (a—a) dy (2) —0, 
Hence, 
(46) P, (2) =— InsoU (@), = (2 — Uns (2), 


and as above 


An® ns2(%) Ansi Pnsi(%) 

In order to illustrate we take in Theorem VI $r(x) = x - \alpha$ and assume
$\varphi_1(\alpha)\varphi_2(\alpha)\cdots\varphi_{k-1}(\alpha) \ne 0$, $\varphi_k(\alpha) = 0$. Then the polynomials $q_i(x)$
$(i = 1, 2, \ldots, k-1)$ in (41) are all of the first degree, while $q_k(x)$ is of
the second degree. Correspondingly, the recurrence relation (19) holds for
$(n = 1, 2, \ldots, k-1)$; for $n = k + 1$ its character changes as indicated
under (ii), (iii). (For $n = k$, (19) does not exist).


= 


Case (iii) is possible. This is evident in the symmetric case with $\alpha = 0$,
for here $\varphi_{2n-1}(0) = 0$ $(n = 1, 2, \ldots)$ so that all the $q_i(x)$ in (41) are of
degree 2. This shows that the upper bound for the degree of $q_i(x)$ as given
in Theorem VI is actually attained in this case.


14. A minimum property of $U_n(x)$. Among all polynomials $G_n(x)$ such
that $G_n(\alpha) = 1$ $(\varphi_n(\alpha) \ne 0)$, it is the polynomial


$\dfrac{K_n(x, \alpha)}{K_n(\alpha)}$


which minimizes the integral $\int_a^b G_n^2(x)\,d\psi(x)$, with the minimum $\dfrac{1}{K_n(\alpha)}$.
The proof can be easily accomplished by using the methods of constrained
extrema.
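
The minimum property lends itself to a quick numerical check. The sketch below assumes the classical form of the statement (minimizing polynomial $K_n(x, \alpha)/K_n(\alpha, \alpha)$, minimum value $1/K_n(\alpha, \alpha)$) and, as a further assumption not made in the text, takes $d\psi(x) = dx$ on $(-1, 1)$ with orthonormal Legendre $\varphi_k$. It evaluates the integral by Gauss-Legendre quadrature and compares the candidate minimizer with a few randomly perturbed polynomials that still satisfy $G_n(\alpha) = 1$. All names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre as L
from numpy.polynomial import polynomial as P


def phi(n, x):
    """Values of the orthonormal Legendre phi_0..phi_n; rows index the points x."""
    k = np.arange(n + 1)
    pts = np.atleast_1d(np.asarray(x, dtype=float))
    return L.legvander(pts, n) * np.sqrt((2 * k + 1) / 2.0)


n, alpha = 4, 0.3
xg, wg = L.leggauss(20)                # exact for polynomial degree <= 39

phi_alpha = phi(n, alpha)[0]
K_aa = phi_alpha @ phi_alpha           # K_n(alpha, alpha)

# Candidate minimizer K_n(x, alpha) / K_n(alpha, alpha), sampled at the nodes.
G_star = phi(n, xg) @ phi_alpha / K_aa
min_value = wg @ G_star**2

# Competitors: G_star + (x - alpha) q(x), deg q <= n - 1, so G(alpha) = 1 still.
rng = np.random.default_rng(0)
worse = []
for _ in range(5):
    q = rng.normal(size=n)             # coefficients of q in powers of x
    G = G_star + (xg - alpha) * P.polyval(xg, q)
    worse.append(wg @ G**2)

print(min_value, 1.0 / K_aa, min(worse))
```

Every admissible competitor differs from the candidate by a multiple of $(x - \alpha)$, and the reproducing property of the kernel makes the cross term vanish, which is why each perturbed integral exceeds `min_value`.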


15. The case of two changes of sign. We wish to investigate the existence 
of a system of polynomials 


$V_n(x) = x^n + \cdots \qquad (n = 0, 1, \ldots)$
satisfying the condition of orthogonality


$(a < \alpha_1 < \alpha_2 < b)$.
We get as above (see § 2), writing 


(2 — a2) Vn (2) = Aidi(2) : 


= 


matrix and use (5,2): 




(41) + Andn(%) = — 
n+ 


The determinant of (48) is (a, — Kn( a1, 


The condition $K_n(\alpha_1, \alpha_2) \ne 0$ is thus seen to be sufficient for the existence
of $V_n(x)$ satisfying (47). Furthermore, this condition insures the unique
determination of $V_n(x)$ in the form


(2) 


an 
— Ae) Kn (T— M2) (Ge) on (ae) 


= 


In particular, if $K_n(\alpha_1, \alpha_2) \ne 0$ and $\varphi_{n+2}(\alpha_1) = \varphi_{n+2}(\alpha_2) = 0$,


(x) 


= 


Moreover, in this latter case (47) is replaced by 


On the other hand, if $K_n(\alpha_1, \alpha_2) = 0$, the consistency of the system (48)
requires the matrix


to be of rank one. We proceed to show that this is possible. In the first place, 
we must have 


nso nsi (%1) gn 
( >) Pnsi (a) fn ( 


n+2\ % n+1 | AOn+2 


which, combined with the assumed relation $K_n(\alpha_1, \alpha_2) = 0$, gives


( (%2) = (), 


If we assume now that dni: (a1) == 0 (hence ¢n(a,) £0), then (8) shows that 
$nii(%) = 0. Conversely, assuming dnii(%1) = 0, we get at once 
= (01, 2) = 0. 

In the second place, consider the last determinant of order 2 of our 




(91 — — (41) gn (%) 


gn 


On+1 
Hence, if (41) = (%2) 0, the determination of Vn(z) by means of 
(48) is no longer unique. In this case An,, in (48) may be assigned arbi- 
trarily. These considerations lead to 


THEOREM VII. A necessary and sufficient condition that a uniquely
determined polynomial, of a given degree n, $V_n(x) = x^n + \cdots$, satisfying
(47), exist, is: $K_n(\alpha_1, \alpha_2) \ne 0$. Moreover, if $K_n(\alpha_1, \alpha_2) \ne 0$ for all n,
there exists a uniquely determined sequence $\{V_n(x)\}$ $(n = 0, 1, \ldots)$ of such
polynomials.


The polynomial $V_n(x)$ may have two (but not more than two) imaginary
or equal zeros, or two (but not more than two) zeros outside the interval (a, b).
To show this construct the polynomial $\Phi_n(x; (x^2 + r^2)d\psi) = x^n + \cdots$
(r arbitrary real constant), orthogonal with respect to the monotonic
characteristic function $\int_a^x (t^2 + r^2)\,d\psi(t)$, and write the orthogonality property
in the form


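(49) $\int_a^b \dfrac{(x^2 + r^2)\,\Phi_n(x; (x^2 + r^2)d\psi)}{(x - \alpha_1)(x - \alpha_2)}\, G_{n-1}(x)\,(x - \alpha_1)(x - \alpha_2)\,d\psi(x) = 0,$
where $(\alpha_1, \alpha_2)$ are zeros of $\Phi_n(x; (x^2 + r^2)d\psi)$. (49) shows the existence
of a polynomial


(50) $V_n(x) = \dfrac{(x^2 + r^2)\,\Phi_n(x; (x^2 + r^2)d\psi)}{(x - \alpha_1)(x - \alpha_2)} = x^n + \cdots$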

orthogonal with respect to the characteristic function 


Furthermore, we readily see that here the determination of $V_n(x)$ is unique
since this is known to be true for $\Phi_n(x; (x^2 + r^2)d\psi)$. Hence, (50) shows
that with such choice of $\alpha_1$, $\alpha_2$, $V_n(x)$ has two imaginary zeros. In like
manner we show the possibility of the existence of two equal zeros or of two
zeros outside (a,b).

The case of s (s > 2) changes of sign could be treated as above. Since, 
however, even for two changes of sign the most important properties of the 
zeros of orthogonal polynomials corresponding to monotonic characteristic 
functions no longer hold, the discussion of this case is omitted. 


UNIVERSITY OF PENNSYLVANIA. 


¹⁰ There is an infinity of such $\alpha_1$, $\alpha_2$ in any subinterval of (a, b).




METABELIAN GROUPS AND PENCILS OF BILINEAR FORMS. 


By H. R. BRAHANA.


Introduction. In a recent paper * it was shown that the problem of classification
of metabelian groups of order $p^{n+m}$ which contain a given abelian
group of order $p^n$ as a maximal invariant abelian subgroup and have commutator
subgroups of order $p^m$ is equivalent to the problem of classification
of the matrices $x_1M_1 + x_2M_2 + \cdots + x_kM_k$ under projective transformations
on the x’s and elementary transformations on the square matrices $M_1, M_2, \ldots, M_k$.
The x’s and the elements of the M’s are of course numbers in a modular field
as are also the coefficients of the transformations. The squareness of the
matrices comes from the requirement that the commutator subgroup be of
order $p^m$. The situation may then be discussed in terms of the invariant
factors of the matrix $x_1M_1 + x_2M_2 + \cdots + x_kM_k$.

The argument of that paper still holds when the commutator subgroup
is not of order $p^m$ and the matrices $M_i$ are not square. In this case however
we are deprived of the use of a well-developed theory of invariant factors.
So far as I know the question of the conjugacy of two matrices of the above
type under transformation on the x’s and simultaneous transformations on
rectangular M’s has not been considered. It is our purpose to consider the
groups which give rise to such matrices in the simple case where m = 4 and
k = 2 and to use the results to obtain normal forms for the matrices. It will
be convenient to interpret the matrices $M_i$ as matrices of bilinear forms, in
which case the matrix above, which we shall denote hereafter as $\lambda_1M_1 + \lambda_2M_2$,
may be taken to represent a pencil of bilinear forms.


1. The groups. We consider groups G = {H, U} where H is abelian,
of order $p^n$, and type 1, 1, ..., and U is an abelian group of order $p^4$ and
type 1, 1, ... from the group of isomorphisms of H. We require that no
operator of U, except identity, be permutable with every operator of H. We
require further that G be metabelian, which implies that its commutator subgroup
is in its central, and that every operator of U determines ² a partition of n
with greatest term equal to 2. Finally, we require that no operator of U


* “Metabelian groups of order $p^{n+m}$ with commutator subgroups of order $p^m$,”
Transactions of the American Mathematical Society, vol. 36 (1934), pp. 776-792.

² “On metabelian groups,” American Journal of Mathematics, vol. 56 (1934),
pp. 490-510.




determine a partition of n in which more than two terms are equal to 2.
An operator U; will be said to be of type I or type II according as the 
partition it determines contains one or two 2’s. The group {H,U;} will be 
said to be of type I or type II depending on the type of U;. In the groups 
which we shall consider U will contain only operators of types I and II, the 
identity excepted. Since the groups in which U contains only operators of 
type I were classified in the paper just referred to we shall suppose that U 
contains at least one operator of type II. 

The central of G under these conditions is of order $p^{n-2}$ and we may
suppose that generators of H are chosen so that all but two are in the central.
Let the two of the generators of H which are not in the central be denoted
by $s_1$ and $s_2$. Then if $U_i$ is an operator of U, {H, $U_i$} will have a commutator
subgroup of order p or $p^2$ according as it is of type I or type II. The maximum
order for the commutator subgroup of G is $p^8$ and occurs only if each
of the operators of every set of four which generate U is of type II and the
resulting eight commutators are independent. The commutator subgroup of
G has an order at least $p^2$ since U contains at least one operator of type II.
Since the commutator subgroup is characteristic we may separate the groups
in question into classes according to the orders of their commutator subgroups
and no two groups belonging to different classes can be simply isomorphic.

Let the order of the commutator subgroup be $p^l$. It is immediately
obvious that there is but one group in the class corresponding to $l = 8$, for
a simple isomorphism is established between any two such groups by a proper
naming of generators. These considerations give the following more general
theorem:


(1.1) There is but one metabelian group G = {H, U} of order $p^{n+m}$ with
commutator subgroup of order $p^{2m}$, provided the operators of U are restricted
to types I and II. In this group no operator of U is of type I.


If two groups U and U′ have different numbers of subgroups composed
of operators of type I, there will exist no simple isomorphism between {H, U}
and {H, U′} in which H corresponds to itself. Accordingly, for $l < 8$ we
may consider the groups in sets determined by the number of subgroups of
type I in U.

When $l = 7$, U cannot contain more than one subgroup of type I. Otherwise
two operators $U_1$ and $U_2$ both of type I could be selected as two of the


*We shall postpone the question of simple isomorphisms between {H,U} and 
{H, U’} in which H does not correspond to itself. 




four independent generators of U. The group {H, $U_1$, $U_2$} would have a
commutator subgroup of order $p^2$. The commutator subgroup of {H, $U_3$, $U_4$}
could be of order at most $p^4$ and hence the commutator subgroup of G could
be of order at most $p^6$. If U contains one subgroup of type I, let us suppose
it to be generated by $U_1$ and generators of H chosen so that $U_1$ is permutable
with $s_2$. Then {H, $U_2$, $U_3$, $U_4$} must have a commutator subgroup of order $p^6$
and by (1.1) just one such group exists. Consequently there exists such a
group and it is completely determined by the requirements that $l = 7$ and
that U contain but one subgroup of type I.* If U contains no operator of
type I there is also but one group. No matter how $U_1, \ldots, U_4$ are chosen,
so long as they generate U, the group {H, $U_1$, $U_2$} will have a commutator
subgroup of order $p^3$ or $p^4$ and in the former case {H, $U_3$, $U_4$} will have a
commutator subgroup of order $p^4$. We may then assume that the commutator
subgroup K′ of {H, $U_1$, $U_2$} is of order $p^4$. Then at most one of the commutators
arising from transformation of H by $U_3$ is in K′ and if so the two
commutators arising from $U_4$ are independent of K′. Hence we may choose
an operator $U_3$ such that the commutator subgroup K″ of {H, $U_1$, $U_2$, $U_3$}
is of order $p^6$. Then of the two commutators arising from transformation of
H by $U_4$ at most one is in K″. From the symmetry in $s_1$ and $s_2$ of the
relations which generators of G just described satisfy it is clear that no
restriction is introduced by assuming that the commutator of U, and s, is in 
Kk”, Let us denote this commutator by s;, and the commutator of U, and s, 
by 8. If s, is not in K” it is in {K”,s,}. If then we replace s, by a proper 
combination of s, and s, we obtain a commutator s‘, which is in K”. We may 
assume further that s,; is in the part of K” which arises from transformation 
of s. by U,, U2, and U3, for otherwise U, could be replaced by such a com- 
bination of U,, U2, U;, and U, that such would be the case. Therefore there 
exists in {U,, U2, U;} an operator U’ whose commutator with sz is s,. This 
operator may be taken to be U,, and consequently we may assume that the 
commutator of UV, and s, is the same as that of U, and s,. There are therefore 
two groups for $l = 7$ and they are distinguished by the numbers of subgroups
of type I in U. These considerations also apply more generally to give the
theorem:


(1.2) If the operators of U are all of types I and II, there are two and only
two groups of the type we are considering of order $p^{n+m}$ with commutator
subgroups of order $p^{2m-1}$. In one of them U contains one subgroup of type I,
and in the other none.


* We use the shorter expression “subgroup of type I” in place of “subgroup
composed of operators of type I and the identity.”




When $l = 6$ the number of independent subgroups of type I in U can
be at most two as may be seen from an argument similar to that used for
$l = 7$. We shall see that U may contain more than two subgroups of type I
but that if so all such subgroups are contained in the group generated by two
of them. The possibilities for the number of independent subgroups of type I
are therefore 2, 1, and 0.

If U contains two independent subgroups of type I, let them be generated
by $U_3$ and $U_4$. Two possibilities arise: the subgroups of {$s_1$, $s_2$} permutable
with the respective operators $U_3$ and $U_4$ may be the same or they may be
different. If they are the same every operator of {$U_3$, $U_4$} is of type I; if
they are different every operator of {$U_3$, $U_4$} except those in {$U_3$} and {$U_4$} is of
type II. In either case since the commutator subgroup of {H, $U_1$, $U_2$} is of
order at most $p^4$, the commutator subgroup of {H, $U_3$, $U_4$} must be of order $p^2$.
In either case every operator of U not in {$U_3$, $U_4$} is of type II. Hence, in
the one case U contains 1 + p subgroups of type I, and in the other contains
two subgroups of type I. The two groups are generated by operators
$s_1, s_2, s_3, \ldots, s_n$, $U_1, \ldots, U_4$ which satisfy the following relations: when U
contains 1 + p subgroups of type I,


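(1.3) $U_1^{-1}s_1U_1 = s_1s_3$, $U_2^{-1}s_1U_2 = s_1s_5$, $U_3^{-1}s_1U_3 = s_1s_7$, $U_4^{-1}s_1U_4 = s_1s_8$,
$U_1^{-1}s_2U_1 = s_2s_4$, $U_2^{-1}s_2U_2 = s_2s_6$;


when U contains two subgroups of type I,


$U_1^{-1}s_1U_1 = s_1s_3$, $U_2^{-1}s_1U_2 = s_1s_5$, $U_3^{-1}s_1U_3 = s_1s_7$,
$U_1^{-1}s_2U_1 = s_2s_4$, $U_2^{-1}s_2U_2 = s_2s_6$, $U_4^{-1}s_2U_4 = s_2s_8$.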


If U contains one subgroup of type I let it be generated by U, and let 
U, be permutable with s.. The commutator subgroup of {H, U;, U2, U3} must 
be of order p® or p*, and {U,,U2,U;} can contain no operator of type I. 
By theorems (1.1) and (1.2) there is but one group in each case. In the 
former case the group is completely determined, since the commutators of 
{H, U,} and {H, U,, U2, U3} are independent. A set of generating relations 
is obtained by adding U,7's,.U; = ss, to (1.3) above. In the second case 
we may obtain such a group by adding the relation U3-'s.U3 = ss, to (1.3). 
It then follows that s, is in the group {53, 84, 85, 86, $7, sx}. The group 
{83,85 87,83} must be of order p*, otherwise a proper combination U’ of 
U,, U2, and U; would transform s, into s,s, and U’U," would be of type I 
contrary to the assumption that U contained no operator of type I besides U.. 
At least two of the operators s,, s., and s;, are independent of {53, $5, Sz, 88}; 
we may then suppose that sz is expressible in terms of 83, 84, 85, 86, $1, and 5%. 




If the group which we are considering is distinct from the one just previously 
described, we must expect at least two possibilities to appear, for if in 
the former group we replace U, by U,U, the commutator subgroup of 
{H, Ui, U2,U;} is of order p®, being generated by 545s, S4, 85, Se, 87, and 53. 
There are then at least these two possibilities in the present case: (a) sg is in 
the group {83, $4, $5, Se, $7, and not in sx}, or (b) sg is in {84, Sx}. 
In case (b) we may assume that s; = sz, for every operator of {54, So, s,} is a 
commutator, and any set of three independent operators of {U,, U2, U3} 
generate it. The operator U, is a unique operator in U, it defines s, to within 
a power of a single operator of {s,,s2}, and s2 in turn defines {584, sg, sx}. 
Hence, this group is distinct from the one previously defined, in which sg is 
not in {84, 86, S,} and as we have seen may lead to case (a). We complete 
this by showing that case (a) leads to a single group, consequently one in 
which U,, U,, U; can be chosen so that {H, U,, U2, U3} has a commutator sub- 
group of order If s, is in {83,- 87, 8} but not in {84, s¢, then there 
exists in {U,, U2, Uz} an operator U’, and in {s,s} an operator s’, such that 
the commutator of U’, and s’; is sxsg. This operator s’, is obviously not a 
power of s.. The operator U’, is not Us; for in that case U;U, would give 
the same commutator with s’, as with s, and hence would be of type I. Con- 
sequently, if we replace U, by U’,U," and s, by s’;, generators of the group 
satisfy the relations found for the first group with one subgroup of type I. 

The two groups each containing one subgroup of type I just described 
have been distinguished by the order of the group {s4, 86, sx, 8s} which in one 
case is p* and in the other p*. They may also be distinguished from each other 
by the non-abelian subgroups which they contain. In one there exists at least
one subgroup {H, $U_1$, $U_2$, $U_3$} of order $p^{n+3}$ with commutator subgroup of
order $p^5$ which contains no subgroup of type I; in the other every subgroup
of order $p^{n+3}$ with commutator subgroup of order $p^5$ contains a subgroup of
type I. Thus there is no simple isomorphism between the two groups in which
H corresponds to itself.

Now suppose that U contains no operator of type I. Two possibilities are
immediately evident: (a) U contains two operators $U_1$ and $U_2$ such that
{H, $U_1$, $U_2$} has a commutator subgroup of order $p^2$; or (b) U contains no


⁵ We beg leave to point out that all the operators of type II in the group of
isomorphisms of H are conjugate, as are all the operators of type I. The two groups
U and U′ are abelian, of order $p^4$, and type 1, 1, ..., and every operator of one can be
transformed into many operators of the other by operators which transform H into
itself. U may be transformed into U′ in many ways. U may not, however, be transformed
into U′ by any operator which leaves H invariant.




such group. The condition (a) completely determines the group, for the
commutator subgroup of {H, $U_3$, $U_4$} must then be of order $p^4$ and must be
independent of the commutator subgroup of {H, $U_1$, $U_2$}. In any case, no
matter how $s_1$ and $s_2$ are chosen the commutator subgroups $H_1$ and $H_2$ arising
from transformation of $s_1$ and $s_2$ respectively by U are of order $p^4$, since U
contains no operator of type I. Since {$H_1$, $H_2$} is of order $p^6$, these two groups
have a subgroup of order $p^2$ in common. This subgroup determines two
operators $U_1$ and $U_2$ such that the commutator subgroup arising from transformation
of $s_1$ by {$U_1$, $U_2$} is the subgroup common to $H_1$ and $H_2$; it determines
two other operators $V_1$ and $V_2$ such that the commutator subgroup arising from
transformation of $s_2$ by {$V_1$, $V_2$} is the same group. In case (a) the group
{$U_1$, $U_2$, $V_1$, $V_2$} is of order $p^2$. In case (b) this group is of order at least $p^3$.
In case it is of order $p^4$, the four commutators arising from transformation
of $s_1$ by $V_1$ and $V_2$ and of $s_2$ by $U_1$ and $U_2$ must be independent of the commutator
subgroup common to $H_1$ and $H_2$. The group is therefore completely
determined. If the group {$U_1$, $U_2$, $V_1$, $V_2$} is of order $p^3$, it is generated by
three operators $U_1$, $U_2$, and $U_3$ and its commutator subgroup is of order $p^4$.
Then the commutator subgroup of {H, $U_4$} must be independent of this group
of order $p^4$, and since there is but one group of order $p^{n+3}$ with commutator
subgroup of order $p^4$ and no subgroups of type I ⁶ it is completely determined.
It is necessary to determine whether or not these last two groups are distinct.
This last group contains a subgroup of order $p^{n+3}$ with commutator subgroup
of order $p^4$ and the preceding group does not.

Normal forms for generating relations of the three groups with commutator
subgroups of order $p^6$ and no subgroups of type I are:


U,*8,U, = 8,84", Us = 5,85, 4 = 8,57, 
= S884, 2 = $283, U,*8.U; = U418.U 4 = 
where r is any not-square. This group contains a subgroup of order $p^{n+2}$ with
commutator subgroup of order $p^2$.

= $183, U2 = U,7*8; Us = == 
= 8284, 2 = 8283, = 8285, = 825s. 
This group contains no subgroup of order $p^{n+2}$ with commutator subgroup
of order $p^2$, but contains one of order $p^{n+3}$ with commutator subgroup of
order $p^4$.
U,-18,0; = S83, U.18,U. = S85, 8, Us; = $87, U4 — 8188, 
= 8285, U 278202 = 8286, 3 = 8283, = 825s. 


⁶ “On metabelian groups,” loc. cit., p. 510.




This group contains no subgroup of order $p^{n+3}$ with commutator subgroup of
order $p^4$.

As in the cases of $l = 8$ and $l = 7$, an obvious rewording of the argument
above gives the more general theorem:


(1.4) If the operators of U are all of types I and II there are seven and
only seven groups of the type we are considering of order $p^{n+m}$ with commutator
subgroups of order $p^{2m-2}$. They may be characterized by their non-abelian
subgroups of orders $p^{n+1}$, $p^{n+2}$, and $p^{n+3}$.


When $l = 5$, the number of independent subgroups of type I in U cannot
be greater than 3, and may be 3, 2, 1, or 0. If there are 3 independent subgroups
of type I in U we may take them to be generated by $U_2$, $U_3$, and $U_4$.
The operator $U_1$ must be of type II. The commutator subgroup of
{H, $U_2$, $U_3$, $U_4$} must be of order $p^3$ and must have no operator in common
with the commutator subgroup of {H, $U_1$}. The groups will then be characterized
by the properties of {H, $U_2$, $U_3$, $U_4$}. Each of the operators $U_2$, $U_3$,
and $U_4$ is permutable with a subgroup of order p of {$s_1$, $s_2$}. These three
subgroups may be the same; two of them may coincide; or all three may be
distinct. If the three subgroups coincide let us suppose the subgroup to be
generated by $s_2$. Then generators of G will satisfy the relations:


$U_1^{-1}s_1U_1 = s_1s_3$, $U_2^{-1}s_1U_2 = s_1s_5$, $U_3^{-1}s_1U_3 = s_1s_6$, $U_4^{-1}s_1U_4 = s_1s_7$,
$U_1^{-1}s_2U_1 = s_2s_4$.


In this case every operator of {$U_2$, $U_3$, $U_4$} will be of type I, and G will contain
$1 + p + p^2$ subgroups of type I.

In the second case the two subgroups of {$s_1$, $s_2$} permutable with operators
of type I of U may be taken to be generated by $s_1$ and $s_2$. In this case generators
of G will satisfy relations the same as those above except that
$U_4^{-1}s_1U_4 = s_1s_7$ is replaced by $U_4^{-1}s_2U_4 = s_2s_7$. The only operators of type I
in {$U_2$, $U_3$, $U_4$} are the operators of {$U_2$, $U_3$} and powers of $U_4$. G therefore
contains 2 + p subgroups of type I.
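
The two counts just obtained can be reproduced by brute force for small primes. The sketch below is an illustration under assumptions, not the author's procedure: each operator of {$U_2$, $U_3$, $U_4$} is modelled by the pair of commutators it produces with $s_1$ and $s_2$, written as exponent vectors modulo p; products of operators correspond to sums of these pairs; and an operator is counted as type I when the resulting 2-rowed matrix has rank 1 (type II when the rank is 2). The generator tables assume that the commutators in question are independent generators, here called $s_5$, $s_6$, $s_7$; the function and variable names are hypothetical.

```python
from itertools import product


def count_type_one_subgroups(p, images):
    """Count subgroups of order p consisting of type I operators.

    images[j] = (row_for_s1, row_for_s2): exponent vectors (mod p) of the
    commutators of the j-th generator with s1 and with s2 (an assumed model).
    """
    m = len(images[0][0])
    rank_one_operators = 0
    for a in product(range(p), repeat=len(images)):
        if not any(a):
            continue  # skip the identity
        r1 = [sum(c * img[0][t] for c, img in zip(a, images)) % p for t in range(m)]
        r2 = [sum(c * img[1][t] for c, img in zip(a, images)) % p for t in range(m)]
        nonzero = any(r1) or any(r2)
        minors_vanish = all(
            (r1[i] * r2[j] - r1[j] * r2[i]) % p == 0
            for i in range(m)
            for j in range(i + 1, m)
        )
        if nonzero and minors_vanish:  # rank exactly 1: type I
            rank_one_operators += 1
    # each subgroup of order p holds p - 1 operators besides the identity
    return rank_one_operators // (p - 1)


# Columns stand for the assumed independent commutators s5, s6, s7.
zero = (0, 0, 0)
case_one = [((1, 0, 0), zero), ((0, 1, 0), zero), ((0, 0, 1), zero)]  # all move s1 only
case_two = [((1, 0, 0), zero), ((0, 1, 0), zero), (zero, (0, 0, 1))]  # U4 moves s2 only

for p in (3, 5):
    print(p, count_type_one_subgroups(p, case_one), count_type_one_subgroups(p, case_two))
```

Under this model the first table gives $1 + p + p^2$ and the second gives $2 + p$, in agreement with the counts stated in the text.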

In the third case we may suppose that two of the groups are generated 
by s, and s, respectively. Generators of G will satisfy the relations: 


U,*2,U, == $084, U 37182 U; = S86, = $87. 


It is obvious that the group {$U_2$, $U_3$, $U_4$} contains but three subgroups of
type I, and that G likewise contains but three subgroups of type I.



652 H. R. BRAHANA. 


When U contains but two independent subgroups of type I the operators
of {s_1, s_2} permutable respectively with these two subgroups of U may
constitute one or two subgroups of order p. If they constitute one subgroup of
order p, let it be generated by s_2. Let U_3 and U_4 generate the respective
subgroups of type I. Then {U_1, U_2} can contain only operators of type II.
The commutator subgroup of {H, U_1, U_2} is of order p^3 or p^4. There is but
one group for each of the orders p^3 and p^4 having the required properties.
If the order of the commutator subgroup of {H, U_1, U_2} is p^3 the group G
is completely determined. Its generators satisfy the relations:


U2, = $83, U. = U; = = $187, 


If the commutator subgroup of {H, U_1, U_2} is of order p^4, then it must have
a subgroup of order p in common with the commutator subgroup of {H, U_3, U_4}.
This common subgroup may be taken to be in the group of commutators
arising from transformation of s_2 by U_1 and U_2, for otherwise U'_1 and U'_2
could be chosen so that the commutator subgroup of {H, U'_1, U'_2} would be
of order p^3. The group G in this case is also completely determined. Its
generators satisfy relations obtained from these above by changing the com- 
mutator of U, and sz from s; to ss. The two groups are obviously distinct ; 
each contains 1 + p subgroups of type I. They may be distinguished by the
fact that the first contains a subgroup of order p^{n+2} with commutator subgroup
of order p^3 and no subgroup of type I, whereas in the second every subgroup
of order p^{n+2} with commutator subgroup of order p^3 contains subgroups of
type I.

If the subgroups of {s_1, s_2} permutable respectively with U_3 and U_4 are
distinct let them be generated by s_1 and s_2. The commutator subgroups of
{H, U_3} and {H, U_4} may coincide, in which case the commutator subgroup
of {H, U_1, U_2} must be of order p^4 and independent of it. Generators of G
therefore satisfy the relations:


U,18,0, = 8,83, = 8,85, = 8187, 
S284, U.718.U 2 = S886, 4 = 


In this case G contains 1+ p subgroups of type I. The group is obviously 
distinct from the three preceding ones. 

If the commutator subgroup of {H, U_3, U_4} is of order p^2, there are but
two subgroups of type I in {U_3, U_4} and therefore but two in U. The
commutator subgroup of {H, U_1, U_2} is of order p^3 or p^4. In the first case
generators of G satisfy the relations:


METABELIAN GROUPS AND PENCILS OF BILINEAR FORMS. 


= 8284, 2 = 828s, 4 = 8287. 


In the other case generators of G satisfy the above relations with ss; replaced 
by s, in the transform of s. by U2. The operator s, cannot be in the group 
{83, S4, 85}, but must be in the group {5s, 84, 85, Se, 87}. We may assume that 
the expression for s, in terms of these operators does not contain s, or s;. 
It is therefore in the group {s3, 85,8}. There is an operator in the group 
{U,,U2,U;} whose commutator with s, is s;, and this operator is not U2. 
If it is not Us, it may serve in place of U, and generators of G satisfy the 
relations above. If it is U3, we have a new group whose generators satisfy 
the above relations with s, for the commutator of Uz and s2. This last group 
contains no subgroup of order p"** with commutator subgroup of order p’ 
except those which contain subgroups of type I; the former group does 
contain such subgroups. 

When JU contains but one subgroup of type I let it be generated by U, 
and let the subgroup of {s,, s2.} permutable with U, be generated by sz. Let 
= The commutator subgroup of {H,U;,U2,U3} is then of 
order p* or p®. If it is of order p* it does not contain s;, The two groups of 
commutators, H, and Hz», obtained by transforming s, and s, respectively 
by {U,, U., Uz} have a subgroup of order p? in common. This subgroup may 
be taken to be {s3,s,} and it defines two operators U, and U, which give 
commutators s; and s, with s, and two operators V,; and V2 which give com- 
mutators sz; and s, with s,.. The group {U;, U2, Vi, V2} is of order p? or p’. 
We have the following two groups: 


U,18,U0, = 8183, = 8184, Us = 8185, = 8,87, 
S284, — S283", ; = 


This group contains a subgroup of order p^{n+2} with commutator subgroup of
order p^2.


U,"3,0, = 8183, 2 = 8,84, = 8184, = $187, 
= 8285, 2 = 8283, Us"'82U 3 = 8284. 


This group contains no subgroup of order p^{n+2} with commutator subgroup of
order p^2. These are the only two groups when U_1, U_2, and U_3 can be selected
so that s_7 is not in the commutator subgroup of {H, U_1, U_2, U_3}.

If G contains no subgroup {H, U,, U.,U3} with commutator subgroup 
of order p*, we may assume that H, and H, have a subgroup of order p in 
common. Generators of G will satisfy the relations: 




= $183, $185, U,'3,U = = S887, 
U, S284, S283, U 18.0, = SoSx. 


The operator s, is in the group {s3,: - -,87;} and it is not in the group 
{S3, 84, 85, 8}. In the expression for s, neither s, nor s, need appear. It may 
therefore be assumed to be in the group {85, 8,87}, but not in the group 
{8s, 8}. There exists in {U., U;, U,} an operator U’, whose commutator with 
8, is s;. This operator is not U; and may be used in place of Uz if it is not U,. 
If it is not U, then H, and H, have the group {83, s,} of order p? in common, 
which brings us back to the group previously described. We may then assume 
in this case that s,—=s;. There are thus three distinct groups with com- 
mutator subgroup of order p*, each containing one subgroup of type I. They 
are distinguished by means of their subgroups of orders p"*? and p”**. 

When U contains no operator of type I the two groups H_1 and H_2 are
each of order p^4 and consequently have a subgroup of order p^3 in common.
This subgroup determines three operators U_1, U_2, and U_3 such that the
commutator subgroup arising from transformation of s_1 by {U_1, U_2, U_3} is the
common subgroup of H_1 and H_2. There are likewise three operators V_1, V_2,
and V_3 determined by s_2 and the common subgroup. The order of {U_1, ..., V_3}
is either p^3 or p^4. If it is of order p^3 it is generated by U_1, U_2, and U_3.
There is but one⁷ such group {H, U_1, U_2, U_3} with commutator subgroup of
order p^3, and since its commutator subgroup must be independent of that
of {H, U_4} the group G is completely determined. Its generators satisfy the
relations:

U,18,U, = 38,83, 8,85, U = 8,8,%8,, U.-18,0, = 5186 
, = 8284, = 8283, Us"'82U 3 = 8285, 4 = 8257, 
where z* — ax + 8 =0 is irreducible, mod p. 

If the order of {U_1, ..., V_3} is p^4 two possibilities arise according as
{U_1, ..., V_3} does or does not contain a subgroup {U'_1, U'_2} such that {H, U'_1, U'_2}
has a commutator subgroup of order p^2. In the first case generators of G
satisfy the relations:

U,'s,U, = 8,83, 8:85, 4 = 3157, 

= 8284, U2 2 = 8283, Us" = 8284, 4 = 828s. 
In the second case generators of G satisfy the relations: 

U,-18,U, = 8,8,, = 8,8, U4-18,U04 = 8,57; 

U 171820, == 8284, 2 = 8283, = 8285, U4" = 828e. 


There are thus thirteen groups for l = 5.


7 Cf. “On metabelian groups.” loc. cit. 




In the case where l = 4 we shall use the results of the paper⁸ on metabelian
groups of order p"*” with commutator subgroups of order p”. Two matrices 
M and N are used to describe the commutator structure of {s_1, s_3, s_4, ..., s_n;
U_1, ..., U_4} and {s_2, s_3, ..., s_n; U_1, ..., U_4} respectively. If the
commutator subgroup of G is {s_3, s_4, s_5, s_6}, the element m_{ij} of M is the exponent
of s_{j+2} in the commutator of U_i and s_1; and the element n_{ij} of N is the
exponent of s_{j+2} in the commutator of U_i and s_2. The groups G may then
be classified according to the classification of the matrices M + λN under
elementary transformations on the matrices M and N simultaneously and
projective transformations on λ, both sorts of transformation having
coefficients in the modular field, mod p.

Let us consider the determinant f(λ) = |M + λN|, and suppose that
f(λ) is the product of four linear factors. If λ_1 is a root of f(λ) = 0 then
s_1 s_2^{λ_1} is permutable with some operator of U. If all four roots of f(λ) = 0
were the same we could take λ_1 to be zero and every element of M would be
zero. This would imply that s_1 was permutable with every operator of U, in
which case U would contain only operators of type I. Since we assume U to
contain at least one operator of type II, f(λ) cannot be the fourth power of a
linear expression in λ. We may then suppose that at least two of the linear
factors of f(λ) are distinct and that their zeros are 0 and ∞. The
determinants of M and N are then both zero. If we consider first the case where
f(λ) has just two distinct zeros we have the two possibilities (a) f(λ) = λ^3
and (b) f(λ) = λ^2. In the first case three of the U's are permutable with s_1
and one with s_2 so that G contains 2 + p + p^2 subgroups of type I. In the
second case two U's are permutable with each so that G contains 2(1 + p)
subgroups of type I.

When f(λ) is the product of four linear factors three of which are distinct
we may suppose its zeros to be 0, 1, and ∞ with 0 counted twice. Then two
U's are permutable with s_1. In this case G contains 3 + p subgroups of type I.

When f(λ) has four distinct zeros they may be taken to be 0, 1, ∞,
and ρ, where ρ is the cross-ratio of the four taken in some arbitrary order.
There are as many such groups as there are projectively distinct unordered
sets of four points on the finite line, mod p. The cases where ρ = 0, 1, and
∞ give the group described just above. The values 2, -1, 1/2 give a single
group, and the two primitive cube roots of -1, when they exist in the modular
field, give a single group. The rest of the numbers in the modular field go
in sets of six to determine a single group. There are therefore (p + 1)/6 or


8 Loc. cit.




(p + 5)/6, according as p is of the form 6k - 1 or 6k + 1, groups which are
distinct from each other and from the groups considered above. Each has
four subgroups of type I.
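
The number of groups with four distinct zeros can be confirmed numerically for small primes. The sketch below is illustrative only; it assumes that two values of the cross-ratio give the same group exactly when one is carried into the other by the substitutions ρ -> 1 - ρ and ρ -> 1/ρ, and the function name cross_ratio_orbits is hypothetical. It requires Python 3.8 or later for the modular inverse pow(r, -1, p).

```python
def cross_ratio_orbits(p):
    """Orbits of rho in F_p minus {0, 1} under rho -> 1 - rho and rho -> 1/rho."""
    remaining = set(range(2, p))
    orbits = []
    while remaining:
        rho = min(remaining)
        orbit, frontier = {rho}, [rho]
        while frontier:
            r = frontier.pop()
            # Apply the two generating substitutions to r.
            for image in ((1 - r) % p, pow(r, -1, p)):
                if image not in orbit:
                    orbit.add(image)
                    frontier.append(image)
        orbits.append(sorted(orbit))
        remaining -= orbit
    return orbits


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, cross_ratio_orbits(p))
```

For p = 7 the orbits are [2, 4, 6], which is the set 2, -1, 1/2, and [3, 5], the two primitive cube roots of -1, so that there are (7 + 5)/6 = 2 groups.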

When f(λ) does not have four linear factors it has at most two. Let us
consider the case where it has just two linear factors. If they are the same,
we may suppose that f(λ) = q(λ)·λ^2, where q(λ) is an irreducible quadratic.
All such quartics are conjugate under the projective group on λ, and hence
there is but one such group. The group G contains 1 + p subgroups of type I,
and it contains a subgroup of order p^{n+2} with commutator subgroup of order p^2
and no operators of type I; this last subgroup corresponds to the irreducible
quadratic.

If the two linear factors of f(λ) are distinct we have f(λ) = q(λ)·λ.
U contains one subgroup permutable with s_1 and one permutable with s_2.
G contains therefore two subgroups of type I and a subgroup of order p^{n+2}
with commutator subgroup of order p^2 and no operator of type I. The number
of distinct such groups is the number of conjugate sets of polynomials
q(λ)(λ - λ_1)(λ - λ_2) under the "rational" projective group on λ. The
irreducible quadratic may be transformed into any particular quadratic and
there is then a group of order 2(p + 1) which leaves it fixed. There exists
an operator of order two which leaves q(λ) and f(λ) fixed, viz., the one which
interchanges the zeros of q(λ) and also interchanges λ_1 and λ_2. Consequently
there are p + 1 quadratics (λ - λ_1)(λ - λ_2) such that q(λ)(λ - λ_1)(λ - λ_2)
belong to the same conjugate set unless λ_1, λ_2, and the zeros of q(λ) constitute
a harmonic set in which case there are (p + 1)/2 in the conjugate set. Of the
first kind there are therefore (p - 1)/2 conjugate sets and of the second kind
one. There are in all (p + 1)/2 groups of this kind.
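
The sizes of these conjugate sets can be illustrated by a small computation. The sketch rests on an assumption not stated in the text: that the group of order 2(p + 1) leaving q(λ) fixed permutes the p + 1 points of the line as the rotations and reflections of a regular polygon of p + 1 vertices. The function name pair_orbits_under_dihedral is hypothetical.

```python
def pair_orbits_under_dihedral(m):
    """Sizes of the orbits of unordered vertex pairs of an m-gon under its dihedral group."""

    def images(pair):
        a, b = pair
        for shift in range(m):
            yield frozenset(((a + shift) % m, (b + shift) % m))  # rotation
            yield frozenset(((shift - a) % m, (shift - b) % m))  # reflection

    pairs = {frozenset((a, b)) for a in range(m) for b in range(a + 1, m)}
    sizes = []
    while pairs:
        start = next(iter(pairs))
        orbit = set(images(start))  # every group element applied to one pair
        sizes.append(len(orbit))
        pairs -= orbit
    return sorted(sizes)


if __name__ == "__main__":
    for p in (3, 5, 7, 11):
        print(p, pair_orbits_under_dihedral(p + 1))
```

Under this model there are, for p = 7, three sets of p + 1 = 8 pairs and one set of (p + 1)/2 = 4 pairs, the latter corresponding to the harmonic case, or (p + 1)/2 = 4 sets in all.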

When f(λ) has but one linear factor the other factor is an irreducible
cubic. G has one subgroup of type I, and a subgroup of order p^{n+3} with
commutator subgroup of order p^3 and no subgroup of type I. To determine
the number of such groups, let us write f(λ) = c(λ)(λ - λ_1), where c(λ) is
the irreducible cubic. The "rational" projective group contains an operator
of order three which transforms a root λ_0 of c(λ) = 0 into its p-th power.⁹
This group associates three numbers λ_1, λ_2, and λ_3 with any one of them, so
that the cross-ratio of λ_0, λ_0^p, λ_0^{p^2}, λ_1 equals the cross-ratio of λ_0^p, λ_0^{p^2}, λ_0, λ_2
equals the cross-ratio of λ_0^{p^2}, λ_0, λ_0^p, λ_3. There are therefore (p + 5)/3 or
(p + 1)/3 conjugate sets of such quartics depending on whether p is of the
form 6k + 1 or 6k - 1.


9 Cf. "On cubic congruences," Bulletin of the American Mathematical Society,
vol. 39 (1933), pp. 962-969.
10 The cubic c(λ) appears as a quartic with one root infinite. And when




There remain the groups G with no subgroups of type I. There are
p + 2 such groups. There are thus 2p + 7 or 2p + 9 groups with commutator
subgroups of order p^4 according as p is of the form 6k - 1 or 6k + 1.
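
The total for l = 4 is the sum of the counts obtained for the several forms of f(λ). A short sketch of the tally follows; the dictionary keys are descriptive labels introduced here and the function name count_groups_l4 is hypothetical, while the counts themselves are those stated in the text.

```python
def count_groups_l4(p):
    """Tally the groups with commutator subgroup of order p^4, for a prime p = 6k - 1 or 6k + 1."""
    if p % 6 not in (1, 5):
        raise ValueError("p must be of the form 6k - 1 or 6k + 1")
    plus_one = p % 6 == 1  # True when p = 6k + 1
    parts = {
        "two distinct zeros, case (a)": 1,
        "two distinct zeros, case (b)": 1,
        "three distinct zeros": 1,
        "four distinct zeros": (p + 5) // 6 if plus_one else (p + 1) // 6,
        "irreducible quadratic, repeated linear factor": 1,
        "irreducible quadratic, distinct linear factors": (p + 1) // 2,
        "irreducible cubic and one linear factor": (p + 5) // 3 if plus_one else (p + 1) // 3,
        "no subgroup of type I": p + 2,
    }
    return sum(parts.values())


if __name__ == "__main__":
    for p in (5, 7, 11, 13):
        print(p, count_groups_l4(p))
```

For p = 5 the tally is 17 = 2p + 7 and for p = 7 it is 23 = 2p + 9, in agreement with the totals just stated.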

Let us now suppose that l = 3. Then however generators of {s_1, s_2} are
selected, H_1 and H_2 are of orders not greater than p^3. Consequently, U
contains at least two subgroups of type I. Let us suppose that U contains four
independent operators of type I. The operators of {s_1, s_2} permutable with
them must constitute at least two subgroups of order p. If they constitute
two subgroups, then two possibilities arise: (a) three of the four independent
U's are permutable with s_1 and the other with s_2, or (b) two U's are
permutable with s_1 and two with s_2. In case (a) generators of G satisfy the
relations:


Us §1S3, U, 8184, U U; 8185, 


U 4718.0 = 8285. 


In this case the group {U_1, U_2, U_3} contains only operators of type I, as does
{U_3, U_4}. There are therefore 1 + p + p^2 such subgroups in the first, and
1 + p in the second; only the subgroup generated by U_3 is in both. Hence
G contains 1 + 2p + p^2 subgroups of type I. In case (b) generators of G
satisfy the relations:


= 883, = 8154, 


In this case {U_1, U_2}, {U_2, U_3}, and {U_3, U_4} each contain 1 + p subgroups
of type I of which two, generated by U_2 and U_3 respectively, are counted
twice each. G therefore contains 1 + 3p subgroups of type I.

When the four independent U’s of type I are permutable with three 
subgroups of order p of {s,, 82}, we may suppose two of them to be permutable 
with s,, one with s., and the other with s,s.-1. The group H, must be of 
order p*, otherwise U, could be selected so that it was permutable with s;. 
Generators of @ satisfy the relations: 


U,78,0, = 8:83, = 8155, 


S285, — S283. 


G contains 1 + 3p subgroups of type I. 


p = 6k + 1, c(λ) may be taken to be λ^3 + a, where -a is not a cube; the group
leaving c(λ) fixed is generated by λ' = ρλ, where ρ is a primitive cube root of unity.
This leaves c(λ) and c(λ)·λ fixed; the remaining 6k quartics are separated into
2k sets of 3 each. We have thus the (p + 5)/3 sets.




When the four independent U’s are permutable with different subgroups 
of {s,,82} we may suppose U, permutable with s,, U2 with s,s,-1, Us with 
$827, and U, with s,. Generators of @ satisfy the relations: 


U,18,0, = 8,83, 8,5,, = 815,", 
U27182U 2 = 8284, Us = 8285, = 8283. 


The groups {U,;, U4}, {U:U4, U2}, {U1"Us, Us} each contain 1 + p subgroups 
of type I, of which those generated by U,U, and U,"U, are counted twice. 
G has therefore 1 + 3p subgroups of type I. 

Of the groups just described the last contains four independent U’s 
permutable with three subgroups of {51,82}, viz., U,; permutable with s,, 
U, and U,U, permutable with s’; and permutable with 
Hence this group is simply isomorphic with the preceding one. The preceding 
one itself contains four independent U’s permutable with two subgroups of 
{8,82}, viz., U, and U, permutable with s., and U; and U,U,4 permutable 
with s’; =s,s,1. This is therefore simply isomorphic with the one which 
precedes it. 

We now suppose that U contains three independent operators of type I.
They cannot all be permutable with the same subgroup of {s_1, s_2}. Suppose
first they are permutable with two subgroups. Then generators of G satisfy
the relations:


U,78,0, = 8183, = 8,84, = 8185, 


The group H, must be of order p* if {U;, Uz, Us} contains no operators of 
type I except those in {U2,U;}, and consequently the commutators in the 
first row may be taken to be s3, sy, and ss5. Now sj; is in {83, 84,85} and con- 
sequently {U,, U2, Us} does contain operators of type IJ not in {U2, U;}. Hence 
the supposition that there are not more than three independent U’s of type I 
and that they are permutable with but two subgroups of {s,, 82} leads to a 
contradiction. 

We suppose then that the three U's are permutable with three subgroups
of {s_1, s_2}. Generators of G satisfy the relations:


U,718,0, 8,8;, 8,83, = 8,8, 
= 828x, 3 = 8284, = 828s, 


For if s; and s, were the same, generators of {U2, Us, U4} could be selected to 
give the case above. Now s; is in {s3, s4,85} and since U, may be replaced 
by any power of itself s; may be taken to be s;. Then s, may be supposed 




to be in the group {s3,s,}. Since Uz, may be replaced by any power of itself 
and U; likewise, we may assume that sx is 84, OF S38s. If == s3 then the 
operator U,U.U; is of type I, and U contains four independent operators of 
type I. If s,—s,, then U,U;* is of type I. The only possibility is that 
8; = 8384. The group @ exists, it contains but three subgroups of type I, and 
is therefore distinct from any of those which precede. 

When U contains just two independent operators of type I they are
permutable with different subgroups of {s_1, s_2}. Two possibilities arise:
(a) U contains 1 + p subgroups of type I, or (b) U contains two subgroups
of type I.

Let the two independent U’s be U; and U,. In case (a) the commutator 
subgroup of {H, U;, U4} is of order p; let it be generated by ss. The com- 
mutators arising from transformation of s, and s2 by U; may be assumed to 
be independent of s;, otherwise {U,, U;, Us} would contain operators of type I 
not in {U;,U,}. These commutators may then be taken to be ss and s, 
respectively. Then U, may be chosen so that the commutator subgroup of 
{H, U2} is in {83,84}. If {U:, U2} contains no operator of type I, as must 
be the case, generators of G satisfy the relations: 


0,7183,0, = 8,8;, 3 = 8185, 
= 2 = S283, = 


In case (b) we may assume that generators of G satisfy the relations:


U, = S183, U.138,U. = U,718, U; = $184, 
= S$oSj, = S83, U 418.0, = 


For the commutator subgroup of {H, U,, U2} is of order at most p*, and the 
commutator subgroups of order p? arising from transformation of s, and sz 
respectively must have an operator in common which does not belong to 
{84,85}. Hach of the groups H, and H, must be of order p*. Therefore the 
commutator of U. and s, can be taken to be s;. The operator s; is in {83, $4, Ss} 
and can be assumed to be s,%s,8. If a is not zero, then {U,,U;} contains 
an operator of type I which is not a power of U;. We may assume a = 0 and 
B=1, in which case U,U,U;U, is of type I. Therefore there is no group 
satisfying these conditions. 

When l = 2, both H_1 and H_2 are of order p^2. It is therefore possible to
find generators of G which satisfy the relations:


U_1^{-1} s_1 U_1 = s_1 s_3,  U_2^{-1} s_1 U_2 = s_1 s_4,
U_3^{-1} s_2 U_3 = s_2 s_3,  U_4^{-1} s_2 U_4 = s_2 s_4.


Hence there is but one group for l = 2.




We proceed to list the groups with enough information to determine the
group in each case. The first column contains the value of l, which is all the
information necessary for the first group and the last. The second column
gives the number of subgroups of type I in U. The third column contains
whatever further information may be necessary. This additional information
in most cases takes the form of a statement of the existence or the non-existence
in G of a "subgroup of order p^{n+α} with commutator subgroup of order p^β and
no operator of type I"; for such a subgroup we shall use the symbol G_{α,β}.
Since it cannot lead to any confusion we shall use the symbol G_{1,1} to stand
for a subgroup of type I.


l G,1’8 other facts l other facts 
23. 0 a Gss, NO Ge, | 
1 24. 0 nO Goo, or | 
6. 1 a Gs,5 28. 4 — 
g. 1 no Gs,5 There are (p + 1)/6 or (p+ 5)/6 
8. 0 a Go,» such groups. 
9. 0 a Gz, and 29. 4 1+p a Ge» 
no Ge» 30. 2 a Go,» 
10. 0 no Ge», no Gs,4 There are (p + 1)/2 such groups. 
12. 2+ p —_ There are (p + 1)/3 or (p+ 5)/3 
13. 3 a such groups. 
14. 1+ a Gos 32. 4 0 ee 
15. 1+>p no Gos There are p + 2 such groups. (Cf. 
16. 1+p above.) 
17. 2 a Gos 14497 
18. 2 no 34, 1+ 3p 
19. 1 a Go» 35. 3 eat 
20. 1 a nO Ge. 36. l+p — 
21. 1 no Goo, no Gs, 37. 2 — — 
22. 5 0 a G2: 


There are in all 2p + 36 or 2p + 38 distinct groups according as p is
of the form 6k - 1 or 6k + 1.


11 Contains a subgroup of order p^{n+2} with commutator subgroup of order p; the
two preceding groups have no such subgroup.




2. The bilinear forms. As was explained for the case l = 4, the
commutator structure of G can be described by means of two matrices M and N.
In the general case these are matrices of four rows and l columns. That there
are two matrices depends on the fact that we are discussing groups G whose
centrals are of order p^{n-2}, and therefore generators of H can be chosen so that
all but two are in the central of G. The argument in the paper cited above
for l = 4 holds for any l. A change in the generators of the commutator
subgroup of G, when the U's and s_1 and s_2 are not changed, replaces the
columns of M by linear combinations of its columns, and replaces the columns
of N by the same linear combinations of its columns. The effect is to replace
M and N respectively by MB and NB where B is a non-singular square matrix
of l rows. A change in the generators of U has the effect of replacing M and
N respectively by AM and AN where A is a non-singular four-rowed square
matrix. A change in the generators of {s_1, s_2} has the effect of replacing the
matrix λ_1 M + λ_2 N by the matrix (aλ'_1 + cλ'_2)M + (bλ'_1 + dλ'_2)N where
s'_1 = s_1^a s_2^b, s'_2 = s_1^c s_2^d, and (ad - bc) ≠ 0.

The matrices M and N may be interpreted as the matrices of two bilinear
forms in four variables x_1, x_2, x_3, x_4 and variables y_1, ..., y_l, in which
case λ_1 M + λ_2 N becomes the matrix of a member of the pencil of bilinear
forms determined by the two whose matrices are M and N. The changes on
generators of the groups then correspond to linear homogeneous non-singular
transformations on the x's, the y's, and the λ's. The problem of classification
of the groups is then identically the problem of classification of pencils of
bilinear forms for l = 2, ..., 8 under these transformations, for every change
of generators of G gives a transformation of the pencil and every
transformation of the pencil, with coefficients in the modular field, gives a transformation
on generators of the group. In the case of l = 4 we were able to classify the
groups most easily by means of the theory of invariant factors of λ_1 M + λ_2 N.
When M and N are not square a corresponding theory has not been developed.
There are some obvious difficulties in the way of extending the theory for
square matrices.

Having classified the groups for these various values of l we are able to
write down immediately a set of normal forms for pencils of bilinear forms
in four x's and l y's. We give these normal forms here numbered in the same
way as the groups at the end of § 1.


Ai + T2Y2 + + L4Ys) + L2Y6 + L3Y7 + L4Ys). 
+ L2Y2 + + L4Ys) + + L2Y6 + 

Ai + T2Y2 + T3Y3 + L4Y4) -+- Ao (21Ys + T2Y6 + + 
Ai + LoY2 + + + Ao(LiYs + 




Ai (2141 + L2Y2 + + do + + LsYo)- 

Ar (211 + T2Y2 + + do + + + 

Ai + L2Yo + + + L2Y5 + T3Ye + L4Ys). 

Ai (2141 + + + L4Y) + deo + 1L2Y1 + + L4Yc), 
r not a square. 

+ Loy2 + LsYs + Lays) + + + LeY2 + 


+ Ley2 + + + Ao(LiYs + Leys + + 
A + LoY2 + + L4Ys) + 


Ar + + + Ao + LsYs). 


+ + Leys) + Az(LiYs + Leys + Vas). 


Ai + + LsY3 + + + LoY1). 

Ar + L2Yo + + L4Ys) + de (21Ys + LoYs). 

Ar + + + + + Lays). 

Ar + LoY2 + Leys) + + + 

Ai + + do (2144 + + LsYs). 

Ai (2141 + + + L4Y) + Az + + LsYs). 
Ar + + Leys + Lays) + + + 


Ar + LoY2 + LsYs + LsYs) + + + 
+ Loy2 + Leys + Lays) + + + + 
Ar + + + Bys] + Leys) + + + + Leys). 


Ar + + LsYs + Las) + + + + 


Ai + L2Yo + + 

Ar + Ley2) + Ao (LsYs + Leys). 

+ T2Y2 + + Ae (L2Y2 + + 

+ LoY2 + LsYs + + + 

Ai (2141 + LoY2+ + + TL2Y1 + 

Ar + Loy2 + + Bys] + ways) + + + 
(ry + + L3Y3 + L4Ys) + re (2142 TL2Y1 + TLsY3). 


Ai + + + BY: -+- V¥2 |) 
+ do (2144 + T2Y1 + L3Y2 + LsYs), 
where A* + 8A* — yA? + BA — a has no linear factor. 


(2141 + + + 


Ai (4141 + LoY2) + + Lays). 
+ LoYo + + [y1 + Ys | + T3Y3 + sY1). 


(2141 + T2Y2 + LsYs) Ae TL2Y1 LsYs). 
+ Loy2) + + 


It is interesting to interpret in terms of the bilinear forms the considera- 


5. 
6. 
8. 
| 9. 
10 
| 
14. 
15. 
16. 
17. 
18. 
19. 
20. 
| 21 
23 
26 
28 
29 
30 
31 
32 
34, 
36 




tions that were used in classifying the groups. The separation of the groups 
into classes according to orders of their commutator subgroups corresponds 
to the separation of the bilinear forms into classes according to the number 
of the variables y. One of the first things which attracts attention with regard
to the groups is that if U is of order p^4 the order of the commutator subgroup
cannot be greater than p^8. This says that any pencil of bilinear forms in
four x's and l y's is expressible as a pencil of forms in four x's and l' y's
where l' ≤ 8. This is of course obvious from a consideration of the matrices
of the forms in terms of which the pencil is expressed.

A pair of numbers λ_1, λ_2 determines a bilinear form of the given pencil;
the pair also determines an operator s_1^{λ_1} s_2^{λ_2} of the group {s_1, s_2}. When these
particular values λ_1, λ_2 are substituted in the expression λ_1 M + λ_2 N the
resulting matrix has four rows each determining an operator of the
commutator subgroup arising from transformation of s_1^{λ_1} s_2^{λ_2} by U. If U contains
an operator permutable with s_1^{λ_1} s_2^{λ_2} the rank of this matrix is less than four.
The corresponding bilinear form determined by λ_1, λ_2 will therefore have a
rank less than four. Consequently the separation of the groups with
commutator subgroups of a given order into classes according to the number of
operators of type I in U corresponds to the separation of the pencils of
bilinear forms with four x's and l y's into classes according to the number
of forms in the pencils which have ranks less than four. We may call such a
form, of rank less than four, singular. Then the classification of pencils has
been made according to the number of singular forms in a given pencil.
Theorem (1.1) states that any two pencils of bilinear forms in m x's and
2m y's are conjugate and that such a pencil contains no singular form.
Theorem (1.2) states that there are two distinct pencils of bilinear forms in
m x's and 2m - 1 y's; one contains no singular form, and the other contains
one.¹²
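
The singular forms of a given pencil can be found by computing the rank of λ_1 M + λ_2 N, mod p, at each of the p + 1 points of the line. The following is a minimal sketch; the pair M, N used in the example is an illustrative diagonal pencil chosen here, not one of the normal forms of the paper, and the function names rank_mod_p and singular_points are hypothetical.

```python
def rank_mod_p(rows, p):
    """Rank of an integer matrix over the field of integers mod a prime p."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # modular inverse, Python 3.8+
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def singular_points(M, N, p):
    """Points (l1, l2) of the line mod p at which l1*M + l2*N has rank less than four."""
    points = [(1, t) for t in range(p)] + [(0, 1)]
    singular = []
    for l1, l2 in points:
        form = [[(l1 * a + l2 * b) % p for a, b in zip(rm, rn)]
                for rm, rn in zip(M, N)]
        if rank_mod_p(form, p) < 4:
            singular.append((l1, l2))
    return singular


if __name__ == "__main__":
    # Illustrative pencil with matrix diag(l1, l1 + l2, l1 + 3*l2, l2), p = 7.
    M = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0]]
    N = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 3, 0], [0, 0, 0, 1]]
    print(singular_points(M, N, 7))
```

The illustrative pencil has exactly four singular forms, one for each linear factor of its determinant λ_1(λ_1 + λ_2)(λ_1 + 3λ_2)λ_2.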

When l = 6 the classification according to the number of singular forms
in the pencil is not enough. There are two pencils which have each one
singular form. In both 6 and 7 above the singular form appears for
λ_1, λ_2 = 1, 0. We may distinguish between them in the following way:
Consider the numbers (x_1, x_2, x_3, x_4) to be the coordinates of a point in a finite
three-space. In terms of the codrdinates of the plane x, 0 no. 6 determines 
a pencil of forms in three x's and five y's and no. 7 determines a pencil of
forms in three x's and six y's. The first pencil contains no singular form.
It is possible to select planes in (x_1, x_2, x_3, x_4) in terms of whose coordinates

12 Where we understand the forms F(x_1, ..., y_l) and k·F(x_1, ..., y_l) to be the
same; otherwise the number is p - 1.




no. 7 will give pencils of forms in three x's and five y's, but every such pencil
will contain singular forms.

There are three distinct pencils of forms in four x's and six y's none of
which contains a singular form. None of the pencils of forms in the variables
of a subspace of (x_1, x_2, x_3, x_4) can be singular. No. 8 determines a pencil
of forms in two z’s and two y’s on the line z, =a, —0. There is no line 
on which nos. 9 or 10 determines such a pencil. No. 9 determines a pencil 
of forms on three z’s and four y’s on the plane z, = 0, and there is no plane 
on which no. 10 determines such a pencil. 

The interpretation in terms of bilinear forms is particularly enlightening
in the case of no. 28. There are (p + 1)/6 or (p + 5)/6 such pencils
depending on the value of p. Each pencil contains four singular forms, each
singular form is given by a pair λ_1, λ_2. Each form of the pencil determines
a point on the line (λ_1, λ_2). The cross-ratio of these four points remains
invariant under projective transformation of the ordered set of four points.
A reordering of the four points gives one of six values of the cross-ratio.
Two forms determined by ρ and ρ' cannot be conjugate unless ρ' is one of the
values ρ, 1 - ρ, 1/ρ, 1/(1 - ρ), (ρ - 1)/ρ, or ρ/(ρ - 1).
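
That a reordering of four points yields only these six values may be verified directly. The sketch below assumes the convention (a, b; c, d) = (a - c)(b - d) / ((a - d)(b - c)) for the cross-ratio, which the text does not fix, and it treats only finite points of the line mod p; the function names are hypothetical.

```python
from itertools import permutations


def cross_ratio(a, b, c, d, p):
    """Cross-ratio (a, b; c, d) of four distinct finite points of the line mod p."""
    num = (a - c) * (b - d)
    den = ((a - d) * (b - c)) % p
    return (num * pow(den, -1, p)) % p  # modular inverse, Python 3.8+


def cross_ratio_values(points, p):
    """All values of the cross-ratio obtained by reordering the four points."""
    return {cross_ratio(*perm, p) for perm in permutations(points)}


def six_values(rho, p):
    """The set rho, 1 - rho, 1/rho, 1/(1 - rho), (rho - 1)/rho, rho/(rho - 1), mod p."""
    inv = lambda x: pow(x % p, -1, p)
    return {
        rho % p,
        (1 - rho) % p,
        inv(rho),
        inv(1 - rho),
        ((rho - 1) * inv(rho)) % p,
        (rho * inv(rho - 1)) % p,
    }


if __name__ == "__main__":
    p, points = 13, (0, 1, 3, 7)
    rho = cross_ratio(*points, p)
    print(rho, sorted(cross_ratio_values(points, p)), sorted(six_values(rho, p)))
```

The two sets printed agree for any four distinct finite points; they contain fewer than six numbers only in the special cases already noted, namely the set 2, -1, 1/2 and the primitive cube roots of -1.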

The differences among the groups that come under 30, 31, and 32 can 
all be interpreted in terms of pencils induced in subspaces of (x_1, x_2, x_3, x_4).
It is suggested that perhaps a more thoroughgoing geometric interpretation 
of the whole situation would be worth while. 


3. The proof of distinctness. We come now to the question of the 
possible isomorphism of two groups belonging to different classes. Our proofs 
of uniqueness of the various normal forms of generating relations have always 
been proofs of uniqueness under automorphisms of G in which H corresponds 
to itself. In none of these groups has H been the only abelian group of
order p^n in G, for the group {U_1, U_2, s_3, s_4, ..., s_n} is such a group.
Moreover, the group U is not in general a characteristic subgroup of G, for any
operator U_i may be replaced by sU_i where s is any operator of H. If s is
in the central of G this has no effect on generating relations, but if s is not
in the central the U’s will in general cease to be permutable. We are con- 
fining our attention to groups G in which the U’s are permutable. This is 
justified by the fact that any classification of metabelian groups must take 
these groups into account; it must depend on the possibilities of “ commutator 
structure ” arising from transformation of H by U. The simplicity of the 
statement in terms of pencils of bilinear forms gives added assurance that 
the limitations imposed on the investigations are natural to the problem and 
not arbitrary. 




The question may be considered from the point of view of the uniqueness 
of the defining relations. The particular defining relations that we have 
chosen in each case are not the only ones that could be chosen. The situation 
is rather that if certain properties of the generators are required, then the 
defining relations can be reduced to the normal form we have given for the 
particular group. If two groups G and G' belonging to different classes were
simply isomorphic, then it would be possible to select generators of G’ which 
satisfied defining relations of G. These generators would in every case satisfy 
the following conditions: 


(1) n of the generators would generate an abelian subgroup H', and
n - 2 of them would generate the central of G';

(2) the remaining four would generate an abelian group of type 1, 1, 1,1; 

(3) no operator of the second group, except identity, would be permutable 
with every operator of the first ; 

(4) the operators of the second group would correspond to operators of 
types I and II in the I-group of the first. 


We have already considered the possibility of such isomorphisms between 
G and G’ in which H corresponds to itself and have seen that none exists when 
the normal forms of the generating relations are different. The group H’ 
cannot then be $\{s_1, s_2, \ldots, s_n\}$; it must, however, include the central
$\{s_3, s_4, \ldots, s_n\}$. We assume then that G and G' have different normal forms
for their generating relations and that there exists a simple isomorphism 
between them. Then there exists in G a group H’ which corresponds to H 


in G’. This group H’ must be 83, S4,° * Sn}, Where 
The numbers k,,- - -, m2 cannot all be zero for then H’ would be H. We may 


suppose that all but /, and 7, or all but &, and kz are zero, for this requires 
only a proper choice of generators of U. We may suppose further that the
two which are not zero are both equal to 1, so that s’; —s,%sU, and 
= (i = 1 or 2). If we may select generators 
of {8,, 82} so that x’; —s,U, and s’,—s,U;. If then i—1, we may replace 
by 8.” = If i= 2, the group U’ = {U’,, U’2, U’;, where 


U’; — U4" 


must contain $U_1$ or $s_1$ and $U_2$ or $s_2$. U' cannot contain both $s_1$ and $s_2$,
for then without changing generating relations we could replace H' by




666 H. BR. BRAHANA. 


$H'' = \{U_1, U_2, s_3, s_4, \ldots, s_n\}$, which is not maximal abelian invariant. We
may therefore suppose that x’; —=s,U, and s’,==s,. If on the other hand 
— = 0, we may suppose that s’,; —s,U, and *,—U;. If Ui =0,, 
s, is in H’ and we may take s’; to be s,. If U; is not U,, then either U;, or 
s, is in U’. We have seen that there are not two independent U’s in H’ and
therefore whatever the value of «8. — %28; we may suppose that 


8’; and 8's == So. 


From this it follows that U, must be permutable with s. and hence must be 
of type I. In the expression for one of the U’;’s as given above in terms of 
81, 82, U,,- the exponent must be different from zero. Making use 
of it, s’, may be replaced by s,” =U, without affecting the generating rela- 
tions. The group U’ must then contain s, or else by the same sort of trans- 
formation H’ can be changed to H. But if s, is in U’ it must be permutable 
with U., U;, and U,. This identifies the group as no. 25. Since this is the 
only one of the groups with 2 +- p + p* subgroups of type I, the original sets 
of relations of generators of G and G’ were transformable into each other. 


4. The general case. The theorems (1.1), (1.2), and (1.4) go beyond 
the case where the order of U is $p^4$. It is clear that the methods used will
suffice to classify the groups G of order $p^{n+m}$ where U, of order $p^m$ and abelian
of type 1, 1, ..., contains only operators of types I and II. It is clear also
that this problem is the same as the problem of classification of families of 
forms. The groups of order $p^{n+m}$ may be separated first into classes according
to the orders of their commutator subgroups. Then each of these classes may 
be subdivided according to the number of independent operators of type I 
in U. When these U’s of type I are segregated, the remaining operators of a 
set of independent generators of U determine a group U’ whose operators are 
all of type II. U' determines two groups $H'_1$ and $H'_2$ which are commutator
subgroups arising from transformation of $s_1$ and $s_2$ respectively by U'. Unless
the commutator subgroup of {H, U'} is the product of the two distinct groups
$H'_1$ and $H'_2$, $H'_1$ and $H'_2$ have a cross-cut different from identity. This
cross-cut determines operators $U_1, U_2, \ldots$ such that their commutators with
$s_1$ generate the cross-cut and it determines operators $V_1, V_2, \ldots$ such that
their commutators with $s_2$ generate the cross-cut. The order of the group
$U'' = \{U_1, U_2, \ldots, V_1, V_2, \ldots\}$ enables us to determine a normal form for
the relations defining {H, U'} and then a normal form for relations defining G.
The kinds of differences that may present themselves are apparent. For 
example, the operators $U_1, U_2, \ldots, V_1, V_2, \ldots$ may be independent or they
may be dependent in various ways. It may be possible to select $\alpha$ U's and $\alpha$ V's
such that the commutator subgroup determined by them and H is of order p*, 
in which case G contains a subgroup Ge. If that is the case it is necessary 
to determine the type of Ga,q which appears. This goes back to the question 
of the invariant factors of the matrix $\lambda_1 M + \lambda_2 N$ where M and N are $\alpha$-rowed
square matrices. Though the method is clear and obviously sufficient it would 
be desirable to continue the study on account of the interesting facts that are 
bound to appear in the classification of polynomials of degree even as small 
as five. 

It is likewise obvious that the methods used here are sufficient for the 
classification of groups where U contains operators of type more “ advanced ” 
than I and II. If U contains an operator of type III, one which determines 
the partition n= and no operator except those 
of types I, II, and III, then the central of G would be of order $p^{n-3}$. Only
three of the generators of H, $s_1$, $s_2$, and $s_3$, need be outside the central of G.
They would determine three matrices $M_1$, $M_2$, $M_3$ and three bilinear forms.
The classes of groups would correspond in a 1-1 manner with the classes of
three-parameter (homogeneous) families of bilinear forms $\lambda_1 M_1 + \lambda_2 M_2 + \lambda_3 M_3$.
This classification would probably depend on the types of pencil as well as 
the types of form contained in the family. The method of procedure is clear 
in its general aspects; the details of the possible difficulties are not so clear. 
On that account the classification should be carried on in detail somewhat 
further in this direction. 

The methods and results point the way also to the treatment of groups 
where U contains operators of type K and none of type greater than K. We 
shall content ourselves with the statement of the following theorem which has 
been clear for some time, although the theorem does not take full account of 
the method of attack we have used. 


The problem of the classification of metabelian groups {H, U} of order
$p^{n+m}$ which contain H as a maximal invariant abelian group and in which U
is in the group of isomorphisms of H is identical with the problem of classification
of k-parameter (homogeneous) families of bilinear forms in m
variables x and an undetermined number of variables y under "rational"
projective transformation on the x's, the y's, and on the parameters. The
number k takes on all values not greater than n/2.


UNIVERSITY OF ILLINOIS. 




A METRICALLY TRANSITIVE GROUP DEFINED BY THE 
MODULAR GROUP. 


By GUSTAV A. HEDLUND.¹


1. Introduction. It is known² that the modular group, $\Gamma$, is metrically
transitive with respect to the real axis. This means that if E is a measurable
point set of the real axis which is invariant under the transformations of the
modular group, $\Gamma$, either E or its complement with respect to the real axis
is a zero set.
Let $\Gamma_2$ be the group of real transformations of the $(\xi, \eta)$ plane, $\pi$, into
itself, given by

$$\xi = \frac{a\xi' + b}{c\xi' + d}, \qquad \eta = \frac{a\eta' + b}{c\eta' + d},$$

where a, b, c and d are integers such that $ad - bc = 1$. The object of this
paper is to prove that the group $\Gamma_2$ is metrically transitive with respect to
the plane $\pi$.
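To make the simultaneous action on both coordinates concrete, the following minimal Python sketch applies one integer substitution with $ad - bc = 1$ to a point $(\xi, \eta)$ in exact rational arithmetic. The helper names `mobius`, `act` and `compose`, and the sample point, are assumptions introduced here for illustration; they are not part of the paper.

```python
from fractions import Fraction

def mobius(m, x):
    """Apply x -> (a*x + b) / (c*x + d) for m = (a, b, c, d)."""
    a, b, c, d = m
    return (a * x + b) / (c * x + d)

def act(m, point):
    """Apply the same unimodular substitution to both coordinates of a point."""
    a, b, c, d = m
    if a * d - b * c != 1:
        raise ValueError("expected integers with ad - bc = 1")
    xi, eta = point
    return (mobius(m, xi), mobius(m, eta))

def compose(m1, m2):
    """Matrix product m1 * m2, i.e. the substitution 'apply m2, then m1'."""
    a1, b1, c1, d1 = m1
    a2, b2, c2, d2 = m2
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2)

T = (1, 1, 0, 1)    # x -> x + 1
S = (0, -1, 1, 0)   # x -> -1/x
p = (Fraction(3, 2), Fraction(-1, 3))   # an arbitrary sample point of the plane
```

Composing substitutions corresponds to multiplying their matrices, so the set of such maps is closed under composition, as a group must be.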

This is the essential result needed in proving the metrical transitivity
of the dynamical system obtained by considering the non-euclidean billiard
problem³ defined by the modular group. The similar result where the modular
group is replaced by a certain Fuchsian group with closed fundamental region
has been published by the author.⁴ The technic in the two cases is similar,
but in the present case the method is not buried under the details involved
in the other case.

The result obtained here implies the metrical transitivity of the group $\Gamma$
with respect to the real axis, but the proof is, of course, very indirect. Conversely,
as an example shows, metrical transitivity of a group G with respect
to the real axis does not necessarily imply metrical transitivity of the group $G_2$,
which is obtained by applying the transformations of G simultaneously to two
variables, with respect to the plane of the two variables.


¹ This paper was completed during the tenure of a National Research Fellowship.

² M. H. Martin, "Metrically Transitive Point Transformations," Bulletin of the
American Mathematical Society, vol. 40 (1934), pp. 606-612.

³ E. Artin, "Ein mechanisches System mit quasiergodischen Bahnen," Abhandlungen
aus dem Mathematischen Seminar der Hamburgischen Universität, vol. 3 (1924),
pp. 170-175.

⁴ G. A. Hedlund, "On the metrical transitivity of the geodesics on closed surfaces
of constant negative curvature," Annals of Mathematics, vol. 35 (1934), pp. 787-808.




2. Quasi-transitive points. Under the transformations of the group $\Gamma_2$,
a point P of $\pi$ is transformed into the set of points congruent to P. If this
set is everywhere dense in $\pi$ the point P will be called a quasi-transitive point.
From the work of Artin⁵ it is known that not only are there quasi-transitive
points, but a point is quasi-transitive if either its abscissa or ordinate belongs
to a certain linear set, the complement of which with respect to the real axis
is a linear zero set. It immediately follows that the set of non-quasi-transitive
points of $\pi$ constitutes a zero set.

With the aid of this result the problem of proving that any invariant 
measurable set in $\pi$ is either a zero set or the complement of a zero set is
considerably simplified. For since the non-quasi-transitive points form an 
invariant zero set, these points can be omitted from a measurable invariant 
set without affecting either the measure or the invariance. The use of this 
fact is illustrated in the following lemma. 


LEMMA 2.1. If E is an invariant measurable set of $\pi$ and D is an open
set of $\pi$, E is a zero set if $E \cdot D$ is a zero set.

For assuming all points of E quasi-transitive, the set E can be obtained
from the set $E \cdot D$ by applying the transformations of the group $\Gamma_2$. But
if $E \cdot D$ is a zero set, the set obtained by applying the denumerable set of
transformations of $\Gamma_2$ is a zero set.

In particular the set D will be chosen as the set $1 < \xi < 2$, $-1 < \eta < 0$.
If E is the given invariant measurable set and it is shown that either $E \cdot D$
or $D - E \cdot D$ is a zero set, the desired theorem will have been proved.


3. A net in D. Let $P(\xi, \eta)$ be a point of D and let the developments
of $\xi$ and $-\eta$ in continued fractions with positive integral partial quotients
be given by

$$\xi = [1, a_1, a_2, \ldots], \qquad -\eta = [0, a_{-1}, a_{-2}, \ldots],$$

where $[b_0, b_1, b_2, \ldots]$ is given by

$$[b_0, b_1, b_2, \ldots] = b_0 + \cfrac{1}{b_1 + \cfrac{1}{b_2 + \text{etc.}}},$$

and the continued fraction may or may not be terminating. If it is terminating,
the representation will not be unique, but this does not affect the
following arguments.
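The bracket notation and the non-uniqueness of a terminating development can be illustrated with a short Python sketch in exact arithmetic. The function names `cf_value` and `cf_expand` are hypothetical helpers chosen for this illustration, and the rational 7/5 is only a sample value.

```python
from fractions import Fraction

def cf_value(quotients):
    """Value of the terminating continued fraction [b0, b1, ..., bn]."""
    value = Fraction(quotients[-1])
    for b in reversed(quotients[:-1]):
        value = b + 1 / value
    return value

def cf_expand(x):
    """Partial quotients of a rational x, found by the Euclidean algorithm."""
    x = Fraction(x)
    quotients = []
    while True:
        b = x.numerator // x.denominator   # integer part of x
        quotients.append(b)
        x -= b
        if x == 0:
            return quotients
        x = 1 / x
```

Here `cf_value([1, 2, 2])` and `cf_value([1, 2, 1, 1])` give the same rational number, which is the ambiguity the text sets aside: a final partial quotient b may always be rewritten as b - 1 followed by 1.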


⁵ E. Artin, loc. cit., p. 174.
⁶ Perron, Die Lehre von den Kettenbrüchen, Teubner, 1913, p. 27.

Given the two sets of positive integers, $a_1, a_2, \ldots, a_\mu$, and $a_{-1}, a_{-2}, \ldots, a_{-\nu}$,
let R be those points of D such that $\xi = [1, a_1, \ldots, a_\mu, \ldots]$ and
$-\eta = [0, a_{-1}, a_{-2}, \ldots, a_{-\nu}, \ldots]$, where again these may be terminating
continued fractions but they must contain enough partial quotients to assure
the presence of the given sequences. The set R includes all the points of a
rectangle in D. The interior, $\Delta$, of R, will be denoted by

$$\{1, a_1, \ldots, a_\mu;\ 0, a_{-1}, \ldots, a_{-\nu}\}.$$

Thus, in particular, D is given by {1; 0}.

Let $L(a_1, a_2, \ldots, a_\mu)$ be the length of a side of $\Delta$ parallel to the $\xi$-axis
and $L(a_{-1}, a_{-2}, \ldots, a_{-\nu})$ the length of a vertical side. The following formula
is readily obtained:

(3.1) $$L(a_1, \ldots, a_\mu) = \prod_{i=1}^{\mu} (P_i P'_i)^{-1},$$

where $P_i = [a_i, \ldots, a_\mu]$ and $P'_i = [a_i, \ldots, a_\mu + 1]$.
A similar formula holds for $L(a_{-1}, \ldots, a_{-\nu})$.
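Formula (3.1) can be checked numerically against the direct definition of the side length, namely the distance between the end points $[1, a_1, \ldots, a_\mu]$ and $[1, a_1, \ldots, a_\mu + 1]$. The sketch below is a reading aid under that interpretation; the function names are hypothetical and the test sequences are arbitrary samples.

```python
from fractions import Fraction

def cf_value(quotients):
    """Value of the terminating continued fraction [b0, b1, ..., bn]."""
    value = Fraction(quotients[-1])
    for b in reversed(quotients[:-1]):
        value = b + 1 / value
    return value

def side_length_formula(a):
    """Product form of the side length for a = [a_1, ..., a_mu]."""
    product = Fraction(1)
    for i in range(len(a)):
        tail = a[i:]
        p = cf_value(tail)                                # P_i  = [a_i, ..., a_mu]
        p_prime = cf_value(tail[:-1] + [tail[-1] + 1])    # P'_i = [a_i, ..., a_mu + 1]
        product *= p * p_prime
    return 1 / product

def side_length_direct(a):
    """Distance between [1, a_1, ..., a_mu] and [1, a_1, ..., a_mu + 1]."""
    left = cf_value([1] + a)
    right = cf_value([1] + a[:-1] + [a[-1] + 1])
    return abs(left - right)
```

For the sample sequence [2, 3] both functions return 1/63, since $P_1 P'_1 P_2 P'_2 = \tfrac{7}{3} \cdot \tfrac{9}{4} \cdot 3 \cdot 4 = 63$.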
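
4. A lemma on continued fractions.

LEMMA 4.1. Let

$$F(x_0, x_1, \ldots, x_n) = \frac{[x_0, x_1, \ldots, x_n]}{[x_0, x_1, \ldots, x_n + 1]}, \qquad 1 \leq x_i.$$

If n is even (odd), F is a non-decreasing (non-increasing) function of each
of the variables $x_i$ $(i = 0, 1, \ldots, n)$.

Let $P_i = [x_i, \ldots, x_n]$ and $Q_i = [x_i, \ldots, x_n + 1]$. Then obviously,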
OP, /dxo = 0Q./0x, = 1, and the following formulas can be obtained by evalu- 
ating the limit of the difference quotient: 


4 a 
(— 1) (— 1) - -, 9). 
02; k=1 k=1 


From these formulas follow: 


Case I, n even. Then P; = Qj, 7 even, and P; = Qi, 1 odd. In this case 
it follows at once from (4.3) that 0F/éz,=0. From (4.3) we have 


} Px’, 
k=1 


But 
PoP; 
LQ + 


since P; = Q,, and hence 0F'/dz, = 0. To complete the proof by induction, 
we assume the theorem true for (1 =1,2,:--,7< mn). Then if 7 is even we 


t, 


wish to show that P)Qo* I P2Qx? 21. From the assumption that the result 


holds for i < j + 1, it follows that PoQo il P.2Qx? 21. But the inequality 


Pj. = implies that = jn +1)/(2jQin +1) 21, 
and the desired result follows. The case where j is odd is treated similarly. 


Case II, n odd. Then P; = Qi, 7 even, Pi; = Qi, 1 odd, and from the 
reversal of the inequalities, the proof in this case follows readily from that 
given in Case I. 


5. The fundamental lemma. 


LEMMA 5.1. There exist positive constants $k_1$ and $k_2$ such that

$$k_1 < \frac{L(a'_1, \ldots, a'_m, a_1, \ldots, a_n)}{L(a'_1, \ldots, a'_m)\, L(a_1, \ldots, a_n)} < k_2,$$

independent of the positive integers $a'_1, \ldots, a'_m, a_1, \ldots, a_n$.
Proof. Let 

P, = an]; +1], 


Ry = dm] | RY, = 4m +1], 
= am]; = Um +1), (te m). 
Then 
4=1 


i i=m+1 
— TI 
i=1 




L (a, Om) L * * An) jaa 8’; 
Let us consider 
[ai,-- am] 


From Lemma 4.1, if m —1 is even, 


where #,(1) is the number obtained by replacing each argument in R; by 1. 
Similarly, if m is odd 


Hence 


Rm-2(1) )i PSS; 


where the products in the parentheses are continued as long as the subscripts 
remain positive. From this follows 


[1] [1,1,1] ) 
[1, 1] [1, A, 1,2} i=1 


provided the infinite products in the parentheses converge. But the successive
convergents of $(1 + 5^{1/2})/2$ are given by $c_1 = [1]$, $c_2 = [1, 1]$, $\ldots$, and it is
readily shown that the infinite products $\prod_{i=1}^{\infty} c_{2i-1}/c_{2i}$ and $\prod_{i=1}^{\infty} c_{2i}/c_{2i+1}$ converge
to positive constants $e_1$ and $e_2$, respectively.
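The convergence claimed here can be observed numerically. The following Python sketch, offered as an illustration rather than as part of the proof, builds the convergents $c_1 = [1]$, $c_2 = [1, 1]$, $\ldots$ exactly and forms partial products of $c_{2i-1}/c_{2i}$ and $c_{2i}/c_{2i+1}$; the function names are hypothetical.

```python
from fractions import Fraction

def convergents(count):
    """c_1 = [1], c_2 = [1, 1], ...: convergents of (1 + sqrt(5)) / 2."""
    values = []
    c = Fraction(1)
    for _ in range(count):
        values.append(c)
        c = 1 + 1 / c
    return values

def partial_products(pairs):
    """Partial products of c_{2i-1}/c_{2i} and c_{2i}/c_{2i+1} for i = 1..pairs."""
    c = [None] + convergents(2 * pairs + 1)   # shift so that c[n] is c_n
    first = Fraction(1)
    second = Fraction(1)
    for i in range(1, pairs + 1):
        first *= c[2 * i - 1] / c[2 * i]
        second *= c[2 * i] / c[2 * i + 1]
    return float(first), float(second)
```

Since $c_n$ is a ratio of consecutive Fibonacci numbers, each factor differs from 1 by the reciprocal of a product of two Fibonacci numbers, so the partial products settle very quickly: the first product stays below 1 and the second above 1.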
Using the same technic it can be shown that the desired inequality 


= GS e,?/e,? 
obtains. Setting = e,?/e,” and k, = e,?/e,2 = the desired lemma.holds. 


6. Some inequalities. Let $\Delta = \{1, a_1, \ldots, a_\mu;\ 0, a_{-1}, a_{-2}, \ldots, a_{-\nu}\}$ be
a chosen rectangle of the net in D. Let $\sigma_\lambda$ denote the set of sub-rectangles
of $\Delta$ given by $\{1, a_1, \ldots, a_\mu, s_1, s_2, \ldots, s_\lambda, 1;\ 0, a_{-1}, a_{-2}, \ldots, a_{-\nu}\}$, where
$s_1, \ldots, s_\lambda$ are arbitrary positive integers, but $\lambda$ is so chosen that $\lambda + \mu$ is
odd. The area of the set $\sigma_\lambda$ is denoted by $\mu(\sigma_\lambda)$.


LEMMA 6.1. There exists a positive constant $k_3$ such that

$$\mu(\sigma_\lambda) > k_3\, \mu(\Delta).$$


This amounts simply to proving that there exists a positive constant k; 
such that 


where k, is independent of 4, 


A brief computation with the aid of (3.1) yields 


L(a, * *°,Qm, 1 ) 
L(a,, Ae, Om) 
Om | [ a2, Om | am 1 


With the aid of Lemma 4. 1 and using the notation of the preceding paragraph, 


L (ay, * *54m,1) 1 1 1 
**,@m) ~ Cy Gy Gy Cy 1+ 5% 


and the above lemma holds with k; = (1 + 5%). 
Using the fact that A + yp is odd, the rectangle 


A =m {1, 01, * 815° * 15 0,1, 0-2,° av} 
is defined by the inequalities: 


[1,@,° » $1, ° 1] [1,@1,°° » Mp, $1, ° 2] £2, 


provided y is odd. The second inequality is reversed if v is even, but it will 
be sufficient to discuss the case v odd. 


The transformation = [1, Qu, is given by 
‘+b 
(6.1) 


where a, b, c and d are positive integers and hence (6.1) is a member of the 
group Tr. If we let 


, 
10; *,81, 4p," a, 1, @ay* a4] 




and 


then it is easily shown (this result is stated in Artin, loc. cit., p. 173) that 
the relations, 


_ +b +b 
1 on’ +d’ He +d 


hold, where a, b, c and d are given in (6.1). Thus it is seen that there is a 
transformation of the group $\Gamma_2$ which transforms the rectangle


A= {1,4,,°° $1," *, 8,15 0,04, ar} 
into the rectangle 
A’ = {1; 0, 8, "$1, Mp," 1, G4,° Av}. 


Using only transformations of the group $\Gamma_2$, each of the non-overlapping
rectangles of o), for fixed A, can be transformed into one of the non-overlapping 
set {1; 81, 41, such that the correspondence 
is one-to-one. 


LemMA 6.2. There exists a positive constant k, such that 
> 


Considering a single pair, A and A’, of corresponding rectangles of the 
sets and o’), 
A brief computation yields the equalities 
p(A’) | — | 
= | + d) + d) (ey: + d) + d)| = 


— (£2 — 72) 

| (€: —m) (2 — 72), 
From the inequalities 1 = é,, &, @2 S 2, —1 Sm, 71, S 0, it follows 
from the last equation that 


p(A") 
But this inequality holds for each of the corresponding pairs in o and or 
and the desired lemma holds with k, = 1/9. 




LEMMA 6.3. If E is a measurable point set of $\pi$ which is invariant under
the group $\Gamma_2$, then

$$\mu(E \cdot \sigma_\lambda) > k_4\, \mu(E \cdot \sigma'_\lambda).$$


Denoting, as above, by A and A’ corresponding rectangles of the sets 
o, and o’) respectively, since # is an invariant set, the set E - A is transformed 
into the set E- A’ by the transformation of the group I, taking A into d’. 
Using the equation 
| (€: — m) (€2 — 72) | 
| — 91) — 72) | 


there is obtained as above 


= kyp(d’). 


But if A is any sub-rectangle, with sides parallel to the axes, of A, and 
A’ is the corresponding rectangle under 7’, precisely the same inequality can 
be obtained, viz., 

= kup (Q’). 


VS US 


The rectangle A being an arbitrary sub-rectangle (with sides parallel to the 
axes) of A, it follows that the Jacobian of 7’, evaluated at any point of A, lies 
between 9 and 1/9. This implies the inequality »(L- A) = kyp(E- A’). By 
summing over the set o), the stated lemma is obtained. 


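LEMMA 6.4. E being a measurable invariant set of $\pi$, there exists a
positive constant, $k_5$, which does not depend on how $\Delta$ is chosen in D, and
is such that if $\lambda$ is chosen sufficiently large,

$$\mu(E \cdot \sigma'_\lambda) > k_5\, \mu(E \cdot D)\, \mu(\sigma'_\lambda).$$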

Case I. E- D = {1, €2,° @m3 0, €-1, €-2,° 
Let A be so chosen that A > n. Then 


where > indicates the sum of the lengths of the intervals for which 
8 
Si\n,* * *,8, are arbitrary positive integers, but the other elements are fixed. 


Similarly, 


& 


where the sum is extended over all positive integral values of s),° - -, 5). 




From Lemma 5. 1, 
L (8), ,81,Mp,° L(A, * *, Ay) 
<k 

* *, $1) * 4) 
where in each term where 1,- - -,1 occurs, there are A such elements. It 
follows from this that 


58). 
Evidently 

= L (8), = 1, 


and 


With the aid of these 


Choosing $k_5 = k_1/k_2$, the lemma holds in this case.


Case II. $E \cdot D$ is a finite set of non-overlapping rectangles of the net in D.

Let $E \cdot D = \sum_{i=1}^{N} R_i$. If $\lambda$ is chosen sufficiently large the proof given in
Case I holds simultaneously for all of the set $R_i$, $(i = 1, \ldots, N)$, and hence


N N 
=} i=1 


The lemma holds again with $k_5 = k_1 k_2^{-1}$.

Case III. $E \cdot D$ is an infinite set of non-overlapping rectangles of the
net in D; $E \cdot D = \sum_{i=1}^{\infty} R_i$.


Given $\epsilon$, there exists an N such that $\mu(\sum_{i=1}^{N} R_i) > (1 - \epsilon)\,\mu(E \cdot D)$. For


N 
any A, 0’) = oy) = (Rs -o’y). If A is sufficiently large, 


Case II can be applied and 
N N 
oy) Su ( Ri) > bike (1—e) D) p (o'r): 
i=1 i=1 


If $\epsilon < 1/3$, the lemma holds with $k_5 = 2k_1/3k_2$.




Case IV. $E \cdot D$ is an open set. This case is already included under III,
for an open set is the sum of an infinite set of non-overlapping rectangles of 
the net in D, together with the boundaries of these rectangles. Since the 
boundaries form a zero set, they do not affect the argument. 


Case V. E-D is a measurable set of positive measure. 
Given *,Qp,@1,° *,a-v, from Lemma 5. 1, 
, . . . . . . . 


8 


and hence p(o’,) is bounded away from zero, for arbitrary A. Let c > 0 be 
such a lower bound. 

Given $\epsilon = k_1 k_2^{-1} 3^{-1} c\,\mu(E \cdot D)$, there exists an open set $E_0$ such that
$E_0 \supset E \cdot D$ and $\mu(E_0 - E \cdot D) < \epsilon$. For $\lambda$ sufficiently large, we have from
Case IV,


> 2h, ( Lo) = 2h, p( D)p(o’y). 
But 
and hence 
= 2k, 3k. D)p(o’,) —e. 
From the choice of «, it follows that 
The desired lemma holds with $k_5 = k_1 3^{-1} k_2^{-1}$.


7. Metrical transitivity. 


THEOREM 7.1. (Metrical transitivity). If E is a measurable set of $\pi$
which is invariant under the group $\Gamma_2$, either $\mu(E) = 0$ or $\mu[C_\pi(E)] = 0$.


It can be assumed that $\mu(E) > 0$. From Lemma 2.1, $\mu(E \cdot D) > 0$.
Let $\Delta$ be a rectangle of the net in D. From Lemmas 6.1-6.4, if $\lambda$ is chosen
sufficiently large,


A) 0’) > 0’y) = sp - D) 
> > D)p(A), 
A) > ku(A), 


where k > 0. 




But this implies $\mu(E \cdot D) = \mu(D)$. For if this were not the case, there
would be a point of D at which the metrical density of the set $E \cdot D$ would be
zero. If P is such a point, a sufficiently small square, S, with P as center lies
entirely in D and $\mu(E \cdot S) < k\mu(S)$. A sequence of non-overlapping rectangles
of the net in D can be so chosen that $S = F + \sum R_i$, where F is a zero
set. For each of these rectangles (7.2) holds and hence

$$\mu(E \cdot S) = \sum \mu(E \cdot R_i) > k \sum \mu(R_i) = k\mu(S).$$

This contradiction implies $\mu(E \cdot D) = \mu(D)$.
The set $C_\pi(E)$ is then a measurable invariant set such that $\mu(D \cdot C_\pi E) = 0$.
From Lemma 2.1, $\mu(C_\pi E) = 0$, and Theorem 7.1 holds.


Theorem 7.1 implies the metrical transitivity of the group $\Gamma$ with respect
to the real axis. For if this were not true there would exist a measurable
non-zero set, $E_1$, of the real axis such that $E_1$ would be invariant under the
group $\Gamma$ and $C(E_1)$, with respect to the real axis, would not be a linear zero set.
But then the set $E_2$, consisting of those points $(\xi, \eta)$ of $\pi$ such that both
$\xi$ and $\eta$ belong to $E_1$, would be a measurable non-zero set of $\pi$, invariant under
$\Gamma_2$, such that neither $\mu(E_2) = 0$ nor $\mu[C_\pi(E_2)] = 0$. This contradicts
Theorem 7.1, and hence the group $\Gamma$ must be metrically transitive with
respect to the real axis.
Conversely, the group⁷ G generated by the transformations

$$T_j: \quad \xi = -\xi' + a_j, \qquad (j = 1, 2, \ldots),$$

for which $\lim a_j = 0$, $j \to \infty$, $a_j \neq 0$, is metrically transitive with respect to
the real axis, but the group $G_2$ is evidently not metrically transitive with
respect to the plane.
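One way to see the last claim, given here as a reading aid and not as the argument of the paper or of Martin: if each generator is a reflection $x \to -x + a_j$ applied to both coordinates, the quantity $|\xi - \eta|$ is unchanged, so a strip such as $|\xi - \eta| < 1$ is an invariant set which is neither a zero set nor the complement of one. The Python sketch below checks the invariance on a sample orbit; the choice $a_j = 1/j$, the starting point and the function names are assumptions made for the illustration.

```python
from fractions import Fraction

def reflect(a_j, point):
    """Apply x -> -x + a_j to both coordinates at once."""
    xi, eta = point
    return (-xi + a_j, -eta + a_j)

def gap(point):
    """|xi - eta|, the quantity left unchanged by every such reflection."""
    xi, eta = point
    return abs(xi - eta)

start = (Fraction(1, 3), Fraction(5, 4))
orbit = [start]
for j in range(1, 8):
    # a_j = 1/j is a sample sequence tending to zero with no zero terms
    orbit.append(reflect(Fraction(1, j), orbit[-1]))
```

Every point of `orbit` has the same value of `gap`, which is the invariant that obstructs metrical transitivity in the plane.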


BRYN MAWR COLLEGE,
BRYN MAWR, PA.


⁷ Martin, loc. cit., p. 611.



SOME INTRINSIC AND DERIVED VECTORS IN A KAWAGUCHI 
SPACE. 


By J. L. SYNGE.


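1. Introduction. Let there be a space of N dimensions with coördinates
$x^i$, and let there be a function F of

(1.1) $$t,\ x^i,\ x'^i,\ \ldots,\ x^{(m)i},$$

where t is a parameter, and

(1.2) $$x'^i = dx^i/dt, \quad \ldots, \quad x^{(m)i} = d^m x^i/dt^m.$$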


We define F to be an invariant in the sense of tensor calculus. This does not
imply the invariability of the functional form of F but simply that, when we
employ a different coördinate system $\bar{x}^i$, we are to associate with it a function
$\bar{F}$, given by the transformation of the arguments of F, t being unchanged:
that is,

(1.3) $$\bar{F}(t, \bar{x}^i, \bar{x}'^i, \ldots, \bar{x}^{(m)i}) = F(t, x^i, x'^i, \ldots, x^{(m)i}).$$


The length of a curve $x^i = x^i(t)$ from $t = t_1$ to $t = t_2$ is defined to be
the invariant

(1.4) $$s = \int_{t_1}^{t_2} F\,dt.$$

Following Craig,¹ we shall call such a space a Kawaguchi space of order m,
although Kawaguchi did not include t among the arguments of F.² If
m = 1 in (1.1), if t is absent, and if F is homogeneous of the first degree
in $x'^i$, the Kawaguchi space reduces to a Finsler space. If, more particularly,
F is the square root of a homogeneous quadratic expression in $x'^i$,
the Finsler space reduces to a Riemannian space. It will be noticed that we
have made no reference to transformation of the parameter t. In both the
Finsler space and the Riemannian space, the arc s as given by (1.4) is
independent of the particular parameter t employed. In the Kawaguchi space,
as discussed in the present paper, no condition is imposed on F to insure
invariance of s under transformation of the parameter t. With the exception
of (5.30), the results established will be true if this invariance exists, but
they do not require it.³

¹ H. V. Craig, "On a generalized tangent vector," American Journal of Mathematics,
vol. 57 (1935), p. 457.

² A. Kawaguchi, "Die Differentialgeometrie in der verallgemeinerten Mannigfaltigkeit,"
Rendiconti Circolo Matematico di Palermo, vol. 56 (1932), pp. 245-276.

Since the parallel displacement of a vector, together with the associated 
ideas of absolute derivative and covariant derivative, play a fundamental part 
in Riemannian geometry, it is natural in studying these more general types 
of geometry to attempt to define parallel propagation and the associated opera- 
tions in a way which reduces to the familiar definition when the space reduces 
to Riemannian space. The operation of absolute differentiation of a contra- 
variant vector defined along a curve has been defined in the Finsler space by 
Taylor and Synge⁴ and in the Kawaguchi space of the second order by Craig,
under the restriction stated. As I understand the work of Kawaguchi,² he
appears to be interested in the most general forms which the operations in 
question could have, rather than the explicit development of the operations in 
terms of the function F. It is with this last development that the present 
paper is concerned. 

Before proceeding to the discussion of absolute differentiation and 
parallel propagation along a curve, it is natural to develop the purely in- 
trinsic properties of the curve itself. The vector which undergoes parallel 
propagation (unless e.g. it is a tangent vector) is not to be regarded as 
intrinsic. 

For the Kawaguchi space of order m I develop a set of m + 1 intrinsic
covariant vectors associated with a curve.⁵ When the space is Riemannian,
m = 1, and there are just two of these vectors, corresponding to the tangent
and first normal. In addition to these vectors, denoted by $\overset{p}{E}_i$, other intrinsic
vectors are also developed. The mode of development is based on taking some
invariant "generating function" H of the variables (1.1), or a larger set
containing derivatives of higher orders with respect to the coördinates.
³ In developing the theory of the Kawaguchi space for m = 2, H. V. Craig ("On
parallel displacement in a non-Finsler space," Transactions of the American Mathematical
Society, vol. 33 (1931), p. 129) subjects F to the condition that s shall be
invariant under transformation of t. Craig's method has been extended to general
values of m by H. Hombu, "On a non-Finsler metric space," Tôhoku Mathematical
Journal, vol. 37 (1933), pp. 190-198.

⁴ J. H. Taylor, "A generalization of Levi-Civita's parallelism and the Frenet
formulas," Transactions of the American Mathematical Society, vol. 27 (1925), pp.
246-264; J. L. Synge, "A generalization of the Riemannian line-element," ibid., pp.
61-67. These papers were written simultaneously and independently.

⁵ Three of these vectors have been given by Craig: see ref. 1.

Passing on to the definition of the absolute derivative of a vector along
a given curve, it appears that the most natural process is that which derives
a covariant vector from a contravariant vector given along the curve. We are
naturally most interested in formulae which involve only the first derivatives 
of the components of the given vector. We find that there are m+ 1 such 
derived vectors for a Kawaguchi space of order m. One of these derived 
vectors may be picked out as the natural generalisation of the ordinary ab- 
solute derivative (whose vanishing implies parallel propagation), but it is 
interesting to note that the case m = 1 is rather peculiar as far as the general 
formula is concerned. 


2. The Eulerian vector $E_i$. In connection with the variational equation

(2.1) $$\delta \int_{t_1}^{t_2} F\,dt = 0,$$

there are associated the well-known Eulerian equations. It is natural to
expect that the expressions which are equated to zero in these equations are
the components of a covariant vector. That is in fact the case, and it seems
most natural to use a variational method to prove it. Craig¹ has established
it by a direct method.
Let us take a singly infinite family of curves,

(2.2) $$x^i = x^i(u, t),$$

where t is a variable parameter along each curve and u is constant along
each curve. Then


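(2.3) $$\frac{d}{du} \int_{t_1}^{t_2} F\,dt = \int_{t_1}^{t_2} \frac{\partial F}{\partial u}\,dt,$$

and this is an invariant. Following Craig, we shall adopt the convenient
notation

(2.4) $$F_{(p)i} = \frac{\partial F}{\partial x^{(p)i}}.$$

In the present connection d/dt means partial differentiation with respect to t,
u being held fixed. Then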


OF m 0 
2.5 — = 
0 
dm a) @ 


Proceeding by successive steps in this way, we get 


OF 0 


du p=1 q=p 




where 


(2.7) (—1)9 Figs ; 


q= 


this is the Eulerian expression. We have then 


(p-1) t=te Oxt 


d ts (aq- 


Now let us suppose that the curves (2.2) are so chosen that the points
for which $t = t_1$ and $t = t_2$ are common to them all, and further that the
values of

$$x'^i,\ x''^i,\ \ldots,\ x^{(m-1)i}$$

are also common to them all at these points. These conditions imply that

(2.9) $$\frac{\partial x^{(q)i}}{\partial u} = 0 \quad \text{for } t = t_1 \text{ and } t = t_2, \qquad (q = 0, 1, \ldots, m-1).$$


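Then the first term on the right-hand side of (2.8) vanishes, and we have

(2.10) $$\frac{d}{du} \int_{t_1}^{t_2} F\,dt = \int_{t_1}^{t_2} E_i \frac{\partial x^i}{\partial u}\,dt.$$

Since this is an invariant, we have, on changing to new coördinates $\bar{x}^i$,

(2.11) $$\int_{t_1}^{t_2} \left( \bar{E}_i \frac{\partial \bar{x}^i}{\partial u} - E_i \frac{\partial x^i}{\partial u} \right) dt = 0,$$

or

(2.12) $$\int_{t_1}^{t_2} \left( \bar{E}_j \frac{\partial \bar{x}^j}{\partial x^i} - E_i \right) \frac{\partial x^i}{\partial u}\,dt = 0.$$

But $\partial x^i/\partial u$ is arbitrary along any one of the curves u = const., except at the
end points, where its components vanish; hence, by the usual method of the
calculus of variations,

(2.13) $$\bar{E}_j \frac{\partial \bar{x}^j}{\partial x^i} = E_i.$$

Hence we have the following result: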
THEOREM I. The Eulerian expressions,

(2.14) $$E_i = \sum_{q=0}^{m} (-1)^q \frac{d^q}{dt^q} F_{(q)i},$$

are the components of a covariant vector in a Kawaguchi space of order m.


Although not intrinsic, we may mention an invariant, which appears as
a by-product of the preceding work. We have seen that the last term on the
right of (2.8) is an invariant. Hence, if the terminal variations are left free,
instead of being restricted by (2.9), it follows that the first term on the right
of (2.8) is an invariant also. Let us put

(2.15) $$X^i = \partial x^i/\partial u,$$

these being components of an arbitrary contravariant vector, given along any
curve u = const. The invariant in question is
(2. 16) (—1)"" 
p=l 

Writing it in a different form, we may state the following result: 

THEOREM II. The expression 
~1 


(2.17) 


(p-1) 


(XS (—1) 
p=1 


8 


is an invariant in a Kawaguchi space of order m, $X^i$ being the components of
an arbitrary contravariant vector given along the curve $x^i = x^i(t)$.


When m = 1, this invariant is

(2.18) $$F_{(1)i} X^i.$$

When m = 2, it is

(2.19) $$X^i \left( F_{(1)i} - \frac{d}{dt} F_{(2)i} \right) + \frac{dX^i}{dt} F_{(2)i}.$$

3. The set of intrinsic vectors $\overset{p}{E}_i$. Since the establishment of the vector
character of the Eulerian vector $E_i$, given in (2.14), involves nothing beyond
the fact that F is an invariant function of the variables (1.1), it is obvious
that we may obtain a class of vectors by the formula (2.14) on substituting
for F any function f(F) of it. In Riemannian geometry it is convenient to
substitute $F^2$. However, if the parameter t is chosen so as to make F constant
along the curve, which can be done by taking for parameter

(3.1) $$s = \int F\,dt,$$

it is easily seen that these vectors only differ from $E_i$ by a constant factor.

It is to be borne in mind in all the subsequent work that new vectors may
be obtained from those given by writing f(F) instead of F.

Now let $\phi(t)$ be any function of t, transforming as an invariant on
transformation of coördinates. Then $\phi F$ is an invariant function of the


684 J. L. SYNGE. 


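variables (1.1), and we may use it as a "generating function" instead of F.
Substituting in (2.14), we deduce that

(3.2) $\sum_{q=0}^{m} (-1)^q \dfrac{d^q}{dt^q} \left( \phi F_{(q)i} \right)$

are the components of a vector: we have used the fact that $\phi$ involves t only.
This reduces to

(3.3) $\sum_{q=0}^{m} (-1)^q \sum_{p=0}^{q} \binom{q}{p} \phi^{(p)} \dfrac{d^{q-p}}{dt^{q-p}} F_{(q)i}$,

or

(3.4) $\sum_{p=0}^{m} \phi^{(p)} \overset{p}{E}_i$,

where

(3.5) $\overset{p}{E}_i = \sum_{q=p}^{m} (-1)^q \binom{q}{p} \dfrac{d^{q-p}}{dt^{q-p}} F_{(q)i}$.

Here $\binom{q}{p}$ is the usual binomial symbol, and $\phi^{(p)} = d^p\phi/dt^p$.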


Now for any assigned value of t, we may choose the values of $\phi, \phi^{(1)}, \ldots, \phi^{(m)}$
arbitrarily, and they are all invariants. Let us make them all zero except $\phi^{(p)}$,
and let $\phi^{(p)} = 1$. Then the vector (3.4) reduces to $\overset{p}{E}_i$, and we may state
the following result:


THEOREM III. The expressions

(3.7) $\overset{p}{E}_i = \sum_{q=p}^{m} (-1)^q \binom{q}{p} \dfrac{d^{q-p}}{dt^{q-p}} F_{(q)i}$, $(p = 0, 1, \ldots, m)$,

are the components of a set of m + 1 covariant vectors in a Kawaguchi space
of order m.⁶
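
The passage from the generating function $\phi F$ to the vectors of Theorem III is Leibniz's rule followed by an interchange of the order of summation. A sketch, assuming that $F_{(q)i}$ denotes $\partial F/\partial x^{(q)i}$, that $\phi^{(p)} = d^p\phi/dt^p$, and that $\phi$ depends on t only (so that $(\phi F)_{(q)i} = \phi F_{(q)i}$):

```latex
% Requires amsmath for \overset and align*.
% Assumed notation: F_{(q)i} = \partial F/\partial x^{(q)i}, \phi^{(p)} = d^p\phi/dt^p.
\begin{align*}
\sum_{q=0}^{m} (-1)^q \frac{d^q}{dt^q}\bigl(\phi\, F_{(q)i}\bigr)
 &= \sum_{q=0}^{m} (-1)^q \sum_{p=0}^{q} \binom{q}{p}\, \phi^{(p)}\,
    \frac{d^{q-p}}{dt^{q-p}} F_{(q)i} \\
 &= \sum_{p=0}^{m} \phi^{(p)} \sum_{q=p}^{m} (-1)^q \binom{q}{p}\,
    \frac{d^{q-p}}{dt^{q-p}} F_{(q)i}
  = \sum_{p=0}^{m} \phi^{(p)}\, \overset{p}{E}_i .
\end{align*}
```

Since the values $\phi^{(p)}$ at a given t may be assigned independently, each coefficient $\overset{p}{E}_i$ must separately transform as a covariant vector.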


It may be of interest to write out a few of these vectors explicitly: 


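(3.8)
$\overset{0}{E}_i = E_i = \sum_{q=0}^{m} (-1)^q \dfrac{d^q}{dt^q} F_{(q)i}$,

$\overset{1}{E}_i = \sum_{q=1}^{m} (-1)^q \, q \, \dfrac{d^{q-1}}{dt^{q-1}} F_{(q)i}$,

. . .

$\overset{m-2}{E}_i = (-1)^{m-2} \left\{ F_{(m-2)i} - (m-1) \dfrac{d}{dt} F_{(m-1)i} + \tfrac12 m(m-1) \dfrac{d^2}{dt^2} F_{(m)i} \right\}$,

$\overset{m-1}{E}_i = (-1)^{m-1} \left\{ F_{(m-1)i} - m \dfrac{d}{dt} F_{(m)i} \right\}$,

$\overset{m}{E}_i = (-1)^m F_{(m)i}$.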


⁶ These vectors were obtained by Craig (ref. 1) for p = 0, p = 1, and p = m.




The vector character of the last is easily established directly. 
For m = 1, there are only two vectors 


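(3.9) $\overset{0}{E}_i = F_{(0)i} - \dfrac{d}{dt} F_{(1)i}$, $\quad \overset{1}{E}_i = -F_{(1)i}$.

For m = 2, there are only three vectors

(3.10) $\overset{0}{E}_i = F_{(0)i} - \dfrac{d}{dt} F_{(1)i} + \dfrac{d^2}{dt^2} F_{(2)i}$, $\quad \overset{1}{E}_i = -F_{(1)i} + 2 \dfrac{d}{dt} F_{(2)i}$, $\quad \overset{2}{E}_i = F_{(2)i}$.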


4. The intrinsic vectors $\overset{p,r}{G}_i$. The quantities $\dot x^i = dx^i/dt$ are components
of a contravariant vector, and hence any one of the expressions

(4.1) $\dot x^j \overset{p}{E}_j$, $(p = 0, 1, \ldots, m)$,

is an invariant. In deriving the vector character of $\overset{p}{E}_i$ as in (3.7), all that
was required was the fact that F is an invariant function of the variables (1.1).
Now (4.1) is an invariant, but it is a function of

(4.2) $t, x^i, x^{(1)i}, \ldots, x^{(2m-p)i}$,

since these are involved in $\overset{p}{E}_j$. Thus we may use (4.1) as a generating
function instead of F in (3.7), provided that we extend the range of summation
to include the variables (4.2). Hence we have the result:


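THEOREM IV. The expressions

(4.3) $\overset{p,r}{G}_i = \sum_{q=r}^{2m-p} (-1)^q \binom{q}{r} \dfrac{d^{q-r}}{dt^{q-r}} \left( \dot x^j \overset{p}{E}_j \right)_{(q)i}$, $(p = 0, 1, \ldots, m;\ r = 0, 1, \ldots, 2m-p)$,

where

(4.4) $\overset{p}{E}_j = \sum_{s=p}^{m} (-1)^s \binom{s}{p} \dfrac{d^{s-p}}{dt^{s-p}} F_{(s)j}$,

are the components of a set of ½(m+1)(3m+2) covariant vectors in a
Kawaguchi space of order m.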
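
The number of vectors quoted in Theorem IV can be checked by counting index pairs. The sketch below assumes the ranges $p = 0, \ldots, m$ and, for each p, $r = 0, \ldots, 2m - p$, which is what the extended range of summation suggests.

```latex
% Counting the pairs (p, r): for each p there are 2m - p + 1 admissible values of r.
\begin{align*}
\sum_{p=0}^{m} (2m - p + 1)
  &= (m+1)(2m+1) - \tfrac12 m(m+1)
   = \tfrac12 (m+1)(3m+2).
\end{align*}
% Spot checks:  m = 1 gives 3 + 2 = 5;  m = 2 gives 5 + 4 + 3 = 12.
```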


Perhaps the most interesting of these vectors is that for which 


(4. 5) p=m, r=m—1. 




We have 


»m-1 m 
m,m 


m 
(4.6) =(— 1)™*{ (mays — 
d 


if m = 1, this becomes 


1,0 
: d 
(4. 7) Gy LVIF + dt (2PIF Gy jayi)- 


When the space is a Finsler space, F is homogeneous of degree unity in $\dot x^i$;


then we have 
0,1 
(4. 8) Gy = Pig — Foot, 


which is the Eulerian vector, to within a sign. We may therefore state the 


following result: 


THEOREM V. In a Kawaguchi space of order m there is associated with
each point of a curve a covariant vector $\overset{m,m-1}{G}_i$ given by (4.6). When the space
is a Finsler space, this vector is identical with the Eulerian vector, except
for sign.
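
The Finsler case of Theorem V can be traced through explicitly for m = 1. This is a sketch under stated assumptions: the vector is built from the generating function $\dot x^j \overset{1}{E}_j$ by the rule of (3.7) with p = 1, r = 0 and the summation running over q = 0, 1; $F_{(q)i}$ denotes $\partial F/\partial x^{(q)i}$; and F is homogeneous of degree one in $\dot x^i$.

```latex
% Assumptions: m = 1, p = 1, r = 0; F homogeneous of degree one in \dot{x}^i.
\begin{align*}
\overset{1}{E}_j &= -F_{(1)j}, \qquad
\dot{x}^j\, \overset{1}{E}_j = -\dot{x}^j F_{(1)j} = -F
  \quad\text{(Euler's theorem on homogeneous functions)}, \\
\overset{1,0}{G}_i
  &= \bigl(\dot{x}^j \overset{1}{E}_j\bigr)_{(0)i}
     - \frac{d}{dt}\bigl(\dot{x}^j \overset{1}{E}_j\bigr)_{(1)i}
   = -F_{(0)i} + \frac{d}{dt} F_{(1)i}
   = -E_i .
\end{align*}
```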


5. The absolute derivative of a contravariant vector along a curve. Let
$X^j$ be a contravariant vector field, the components being functions of the
coördinates only. Let us take as generating function any one of the expressions

(5.1) $X^j \overset{p}{E}_j$, $(p = 0, 1, \ldots, m)$.

This is a function of the variables (4.2), and hence may be substituted for F
in (3.7), provided that the range of summation is suitably changed, the
p of (3.7) being changed to another letter. In fact, we get a vector constructed
as in (4.3). In order to show that the vectors obtained in this way
are "derived" from $X^j$, we shall adopt the notation

(5.2) $\overset{p,r}{D}_{ij} X^j = \sum_{q=r}^{2m-p} (-1)^q \binom{q}{r} \dfrac{d^{q-r}}{dt^{q-r}} \left( X^k \overset{p}{E}_k \right)_{(q)i}$, $(p = 0, 1, \ldots, m;\ r = 0, 1, \ldots, 2m-p)$.


We may state this result: 


THEOREM VI. The formula (5.2) defines a set of ½(m+1)(3m+2)
covariant vectors derived (along a given curve) from a contravariant vector
field $X^j$ in a Kawaguchi space of order m, $\overset{p}{E}_j$ being as given in (4.4).




If r = 0, we shall have, in the course of the calculation, to take the partial
derivative of $X^j$ with respect to $x^i$. Consequently the derived vector will
involve the partial derivatives of the vector field, as well as its derivatives
with respect to t along the curve. But if we take r > 0, the formula will
involve only the derivatives of $X^j$ with respect to t, as in the formula for
the absolute derivative in Riemannian space, which is

(5.3) $\dfrac{dX^i}{dt} + \left\{ {i \atop jk} \right\} X^j \dfrac{dx^k}{dt}$.

The formula (5.2) will involve the derivatives of $X^j$ with respect to t
up to and including the order 2m − p − r. Let us confine our attention to
those which involve derivatives of the first order only. We are then to take

(5.4) 2m − p − r = 1, r = 2m − p − 1.

Let us then consider the derived vectors

(5.5) $\overset{p,\,2m-p-1}{D}_{ij} X^j = (-1)^{p+1} \left\{ \left( X^j \overset{p}{E}_j \right)_{(2m-p-1)i} - (2m-p) \dfrac{d}{dt} \left( X^j \overset{p}{E}_j \right)_{(2m-p)i} \right\}$.

If

(5.6) 2m − p − 1 = 0,

then $\partial X^j/\partial x^k$ will appear in the evaluation of this expression. Now (5.6)
can be true only if m = 1, and then for the value p = 1. It is interesting
to see what (5.5) gives in this particular case. We have


(5.7) Dyk 
where (cf. (3.8) ) = — thus we have 


0X5 


7 


To complete the case m = 1, we must also put p = 0: this gives

(5.9) $\overset{0,1}{D}_{ij} X^j = -\left( X^j E_j \right)_{(1)i} + 2 \dfrac{d}{dt} \left( X^j E_j \right)_{(2)i}$.


Here we are to put (cf. (3. 8) ) 


7 2 7 
(5. 10) = Fj; —F (1)j = F — — 


Thus 
0,1 
(5. 11) — X4(Bs) aye + cos) 


= — 2[ dt + XI{4 (Foy — 


+ + $F 01) J. 
We may state this result: 


THEOREM VII. In a Kawaguchi space⁷ of order 1 the formulae (5.8)
and (5.11) define two covariant vectors derived from a contravariant vector
field $X^j$; (5.8) involves the partial derivatives of $X^j$, but (5.11) involves
only the derivatives with respect to t.


It is interesting to see what the derived vectors (5.8) and (5.11)
degenerate to in the case of a Finsler space. We shall not, however, use these
formulae as they stand, but the corresponding formulae with $F^2$ substituted
for F. We shall denote the corresponding vectors by


1,0 


0,1 
(5. 12) 
We shall write
(5.13) $f_{ij} = \tfrac12 (F^2)_{(1)i(1)j}$.


We know that these are covariant tensors. Using the fact that $F^2$ is homogeneous
of degree two in $\dot x^i$, we obtain


dXi 
(5. 14) = 2 { fij 
Xigok 


2 (6 4. Xi 


⁷ In the sense of the present paper, a Kawaguchi space of order 1 is not necessarily
a Finsler space: for the Finsler space, F is homogeneous of degree unity in $\dot x^i$.


j 
fj 0X | 
jk axt if 




In the still more particular case of Riemannian space, we have 
$f_{ij} = g_{ij}$,


which are functions of the codrdinates only. Then we have 


dak 
whose vector character is well known. 
For the other derived vector in a Finsler space, we have 


aXi 
(5.1%) — = 


(5. 18) { fij “dt 


Ofix 
Oak Oa) 


Hence we may state this result: 


THEOREM VIII. When the Kawaguchi space of order 1 is a Finsler space
the covariant vectors (5.8) and (5.11) derived from a contravariant vector
field $X^j$ degenerate (when F is replaced by $F^2$) to (5.14) and (5.18).


Except for the numerical factor —4, formula (5.18) expresses the 
covariant components of the absolute derivative of a contravariant vector in a 
Finsler space, as defined by Taylor and Synge.* Since the vanishing of this 
absolute derivative is taken to define parallel propagation in a Finsler space, 
it seems appropriate to adopt the following definition of parallel propagation 
in a Kawaguchi space of order 1, which, it is to be remembered, differs from
a Finsler space only in so far as F is not necessarily homogeneous of order
one in $\dot x^i$.


DEFINITION. A vector $X^j$ is propagated parallelly along a curve in a
Kawaguchi space of order 1 when it satisfies the equations


0,1 


(5.19) 0, 
where 




Let us now return to the general Kawaguchi space of order m and the 
derived vectors given in (5.5). Let us put 


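(5.21) p = m − 1,
so that the formula reads

(5.22) $\overset{m-1,m}{D}_{ij} X^j = (-1)^m \left\{ \left( X^j \overset{m-1}{E}_j \right)_{(m)i} - (m+1) \dfrac{d}{dt} \left( X^j \overset{m-1}{E}_j \right)_{(m+1)i} \right\}$.

We have already investigated this for m = 1 (cf. (5.9)) and we have seen
that, if we put $F^2$ in place of F, it reduces to a familiar expression in a
Finsler space. Let us now reduce the expression for any value of m. We have

(5.23) $\overset{m-1,m}{D}_{ij} X^j = (-1)^m \left[ X^j \left( \overset{m-1}{E}_j \right)_{(m)i} - (m+1) \dfrac{d}{dt} \left\{ X^j \left( \overset{m-1}{E}_j \right)_{(m+1)i} \right\} \right]$.

Let us turn to (3.8); we have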


m-1 
(5. #4) (F — mF tuys) 
q= 


and so 


m-1 
(5.25) cme = (—1)™* jemi 


™ 
— MF (mj om-1)4 — F omy 


q=0 
(25) (marys = — (—1)™* MF (my 
Thus (5.23) reads 

(5.26) Diy Xi = — m(m + 1) 
m 

q=0 


which agrees with (5.11) when we put m = 1.

We get a derived vector analogous to (5.26) if we substitute for F any
function of F. The earlier work indicates that $F^2$ is the most suitable function
to take. We may state the following result:


THEOREM IX. In a Kawaguchi space of order m the formula 


m-1,m 
(5.27) XI — m(m + 1) (F*) cmyicmy; Tt 


+ {m(F?) ¢m-1)i¢m)j — (PF?) 


m 
— m? (F?) XI 
q=0 




defines a covariant vector along any curve $x^i = x^i(t)$ along which the
contravariant vector $X^j$ is given.


The following definitions may be set down: 


DEFINITION. $\overset{m-1,m}{\Delta}_{ij} X^j$ is the absolute (covariant) derivative of the vector $X^j$.


DEFINITION. A vector $X^j$ is propagated parallelly along a curve in a
Kawaguchi space of order m if its components satisfy the differential equations

(5.28) $\overset{m-1,m}{\Delta}_{ij} X^j = 0$.


The type of geometry with which the present paper deals is essentially 
concerned with processes of generalisation. Generalisations are by no means 
unique, and a method such as that developed in the present paper opens up 
an embarrassing variety of generalisations. Riemannian geometry is the well- 
established base with which (as a particular case) our generalisations are to 
be checked. The definitions adopted above do check with well-established 
results in Riemannian space, but we may ask whether, in the case of the 
Kawaguchi space of order m, we have been wise to use $F^2$ as the generating
function in (5.27), instead of $F^{m+1}$. This would equally well give agreement
with results in Riemannian or Finsler space, since when m = 1 we have
m + 1 = 2.
It is easily seen that 
(5. 29) fis = cmyicms 


is a covariant tensor in a Kawaguchi space of order m; if the determinant
of $f_{ij}$ is not zero, we may introduce a conjugate contravariant tensor $f^{ij}$, and
use it to convert covariant vectors into contravariant vectors. Thus although
the formula (5.27) derives a covariant vector from a contravariant vector $X^j$,
we can at once obtain a contravariant derived vector,


m-1,m m-1,m 


(5. 30) A® XI Ay, 


It should be noted, however, that this cannot be done if s is invariant
under transformation of t, for in that case the determinant of $f_{ij}$ vanishes.⁹


UNIVERSITY OF TORONTO. 


⁹ Cf. H. V. Craig, Bulletin of the American Mathematical Society, vol. 36 (1930),
p. 560.




AN ANALYTIC CHARACTERIZATION OF SURFACES OF FINITE
LEBESGUE AREA.¹ PART I.


By CHARLES B. MORREY, JR.²


Since Schwarz³ showed that the ordinary definition of the length of a
curve could not be generalized directly to give a definition of the area of a
surface, many definitions of the area of a surface have been proposed. In
this paper, we shall use that proposed by Lebesgue in his thesis.⁴ Although
this definition was almost forgotten for over twenty years due to the lack of
methods for handling it and also perhaps for esthetic reasons, its usefulness
in connection with the solutions of the Problem of Plateau (particularly those
of Radó⁵ and McShane⁶) demonstrates its value as a tool in Analysis and
Geometry.

This definition presupposes a definition of limit elements in the field of
surfaces. For surfaces z = f(x, y), it is clear that we should say that a
sequence of surfaces $S_n$ had the surface S as its limit if and only if the
corresponding functions $f_n(x, y)$ converged uniformly to the limiting f(x, y).
An ideal extension of this definition to general surfaces is furnished by
Fréchet's definition of the distance between two surfaces.⁷ It is a curious fact
that, although earlier workers in the area of surfaces (such as Lebesgue and
Geöcze) clearly had some such definition of convergent sequences in mind,
it was not precisely formulated until so recently and has accordingly been used
only in the work of Radó, McShane, Douglas, and the author.

The problem considered in this paper is that of determining an analytic 


¹ Part I was presented to the American Mathematical Society on December 27, 1932,
under the title "An analytic criterion that a surface possess finite Lebesgue area."
Part II was presented on April 14, 1933, under its present title.

² National Research Fellow (1931-33).

³ H. A. Schwarz, Gesammelte Abhandlungen, vol. 1, p. 309.

⁴ H. Lebesgue, "Intégrale, longueur, aire" (Dissertation), Annali di Matematica,
ser. 3, vol. 7 (1902), pp. 231-359.

⁵ T. Radó, "On the problem of least area and the problem of Plateau,"
Mathematische Zeitschrift, vol. 32 (1930), pp. 763-796.

⁶ E. J. McShane, "Parametrizations of saddle surfaces with application to the
problem of Plateau," Transactions of the American Mathematical Society, vol. 35
(1933), pp. 716-733.

⁷ M. Fréchet, "Sur la distance de deux surfaces," Annales de la Société Polonaise
de Mathématique, vol. 3 (1924), pp. 4-19.




characterization of surfaces of finite area more or less analogous to that of
rectifiable curves. Accordingly, we shall list mainly researches on this problem
and refer the reader to Radó's Ergebnisse tract "On the Problem of Plateau"⁸
for the most important literature on the general theory of the area. Surfaces
z = f(x, y) of finite Lebesgue area were first characterized by Geöcze⁹ and
later independently by Tonelli.¹⁰ Tonelli also characterized functions f(x, y)
(calling them absolutely continuous) for which the area of the surface
z = f(x, y) is given by the classical integral formula, and these functions have
been invaluable in subsequent work on area. McShane¹¹ and the author¹²
independently defined a class of representations (called class L) of surfaces
for which L(S) is finite and given by the classical integral formula and
McShane¹³ characterized "saddle" surfaces of finite area bounded by Jordan
curves by showing that each such surface possesses a representation of class L
(in fact "generalized conformal").

The present paper gives an analytic characterization of the most general
surface of finite Lebesgue area. It is first shown (in part I) that every
non-degenerate (see § 1) surface of finite Lebesgue area possesses a generalized
conformal (see § 2) representation. To characterize arbitrary surfaces, it is
found helpful to allow parametric representations of surfaces on certain sets
in 3-space called hemicactoids, a theory of such representations having been
fully developed in the author's recent paper "The topology of (path)
surfaces"¹⁴ which will hereafter be referred to as T. It is then shown in § 3
(part II) that a necessary and sufficient condition for a surface S to possess
finite Lebesgue area is that there exists a hemicactoid H on which S may be
represented, the representation being generalized conformal on each
non-degenerate cyclic element (see § 3).

Throughout this paper we shall use the following vector notation: the 


⁸ T. Radó, "On the Problem of Plateau," Ergebnisse der Mathematik und ihrer
Grenzgebiete (Springer), vol. 2 (1933).

⁹ Z. de Geöcze, "Die notwendigen und hinreichenden Bedingungen für einer
endlichen Flächeninhalt eines Flächenstückes," Mathematikai és Physikai Lapok, vol. 25
(1916), pp. 61-81.

¹⁰ L. Tonelli, "Sulla quadratura delle superficie," Atti della Reale Accademia dei
Lincei, ser. 6, vol. 3 (1926), pp. 357-362, 445-450, 633-638, 714-719.

¹¹ E. J. McShane, "Integrals over surfaces in parametric form," Annals of
Mathematics, vol. 34 (1933), pp. 815-838.

¹² C. B. Morrey, Jr., "A class of representations of manifolds (Part I)," American
Journal of Mathematics, vol. 55 (1933), pp. 683-707 (hereafter cited as R).

¹³ Loc. cit. (first reference).

¹⁴ C. B. Morrey, Jr., "The topology of (path) surfaces," American Journal of
Mathematics, vol. 57, no. 1 (January, 1935), pp. 17-50.




letters x and X shall stand for the coördinates $(x^1, \ldots, x^N)$ and $(X^1, \ldots, X^N)$
of a point in the x-space in which the given surface lies, the letters u and U
for (u, v) and (U, V) respectively, the sum and difference of pairs of these
letters, i.e., $x_1 \pm x_2$ or $u_1 \pm u_2$, will denote the vector sum and differences in
the respective spaces, $x_\alpha$ will stand for the vector $(\partial x^1/\partial\alpha, \ldots, \partial x^N/\partial\alpha)$,
$\alpha$ being a parameter, x(u) and X(U) will be vector functions, and if $\phi$ is a
vector in any space, $|\phi|$ shall denote its length. We shall sometimes write
x(P) to mean x(u) where u is the coördinate vector of the point P. Given
a point set E, $\bar E$ shall denote its closure and E* the set of its frontier points.
The letters r and R shall always denote Jordan regions (i.e., regions bounded
by a single Jordan curve). All vector functions occurring in a transformation
or a representation of a surface will be assumed to be continuous.


1. Non-degenerate vector functions. In this section, we shall merely 
demonstrate a few simple properties of such vector functions which, however, 
are invaluable in the developments of the next section. 


Definition 1. Let x(u) be a (continuous vector) function defined on $\bar r$.
We define the oscillation of x(u) over the set, E, as the least upper bound of
$|x(u) - x(u')|$, for all u, u' in E. (T, def. 1, § 3.)


Definition 2. Let x(u) be defined on $\bar r$ and suppose C is a continuum,
in $\bar r$, of diameter $\ge \rho > 0$. We define $\eta_1(\rho, x; C)$ as the oscillation of x(u)
over C and $\eta_1(\rho, x)$ as the greatest lower bound of $\eta_1(\rho, x; C)$ for all such C.


Definition 3. We shall say that a continuum, C, is the upper limit of a
sequence, $\{C_n\}$, of continua if (T, def. 1, § 2)


(i) all the limit points of a sequence, $\{P_n\}$, of points, $P_n \in C_n$, lie on C;
(ii) if P is any point of C, there is a sequence, $\{P_{n_k}\}$, of points, $P_{n_k} \in C_{n_k}$,
which converges to P, $\{n_k\}$ being a subsequence of the integers.


If C is also the upper limit of every subsequence of $\{C_n\}$, then we say that
C is the limit of the sequence $\{C_n\}$ and that $\{C_n\}$ converges to C.
The following lemma is well known:¹⁵


LEMMA 1. If $\{C_n\}$ is a sequence of continua in a closed bounded region
R, then a subsequence of $\{C_n\}$ possesses a unique limit continuum, C. Thus
the sets (1) of all continua of R, and (2) of all continua in R of diameter $\ge \rho$,
are compact.


¹⁵ See, for instance, R. L. Moore, "Foundations of point set theory," American
Mathematical Society Colloquium Publications, vol. 13, pages 28, 29.


THEOREM 1. Suppose x(u) is defined and continuous on $\bar r$. Then
$\eta_1(\rho, x; C)$ is lower semicontinuous in C, and thus takes on its minimum,
$\eta_1(\rho, x)$, on some continuum of diameter $\ge \rho$.


Proof. Let $\{C_n\}$ be a sequence of continua, of diameter $\ge \rho$, with limit
continuum C. Let $P_1$ and $P_2$ be points of C such that

$\eta_1(\rho, x; C) = |x(P_1) - x(P_2)|$.

We may select a subsequence $\{n_k\}$ of the positive integers such that
$P_i^{(k)} \to P_i$ ($P_i^{(k)} \in C_{n_k}$) (i = 1, 2), and $\eta_1(\rho, x; C_{n_k}) \to \liminf \eta_1(\rho, x; C_n)$. Then
clearly

$\eta_1(\rho, x; C) = |x(P_1) - x(P_2)| = \lim_{k \to \infty} |x(P_1^{(k)}) - x(P_2^{(k)})| \le \lim_{k \to \infty} \eta_1(\rho, x; C_{n_k})$,

which proves the theorem.


THEOREM 2. If the (continuous vector) functions $x_n(u)$, defined on $\bar r$,
approach x(u) uniformly,

$\eta_1(\rho, x) \le \liminf_{n \to \infty} \eta_1(\rho, x_n)$.

Proof. For each n, there exists a continuum $C_n$, of diameter $\ge \rho$, such
that $\eta_1(\rho, x_n) = \eta_1(\rho, x_n; C_n)$. We may select a subsequence, $\{n_k\}$, of integers
such that $C_{n_k} \to C$ and $\eta_1(\rho, x_{n_k}) \to \liminf \eta_1(\rho, x_n)$. Now let $P_1$ and $P_2$ be
points on C for which $|x(P_1) - x(P_2)|$ is a maximum, and let $\{n_l\}$ be a
subsequence of the integers $\{n_k\}$ so that we can find points, $P_{i,n_l}$, on $C_{n_l}$,
so that $P_{i,n_l} \to P_i$, (i = 1, 2). Then clearly

$\eta_1(\rho, x) \le |x(P_1) - x(P_2)| = \lim_{l \to \infty} |x_{n_l}(P_{1,n_l}) - x_{n_l}(P_{2,n_l})| \le \lim_{l \to \infty} \eta_1(\rho, x_{n_l}) = \liminf_{n \to \infty} \eta_1(\rho, x_n)$,

which proves the theorem.


Definition 4. A vector function is said to be non-degenerate on a continuum,
C, if it is not constant over any continuum of C containing more than
one point (cf. T, def. 5, § 4).

The following two theorems follow immediately from the definitions. 


THEOREM 3. If x(u) is non-degenerate on $\bar r$, $\eta_1(\rho, x) > 0$ if $\rho > 0$.
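
Two simple cases may help fix the meaning of the greatest lower bound in Definition 2 and of Theorem 3. They are illustrations only, not taken from the paper: $\bar r$ is assumed to be the closed unit disc, $0 < \rho \le 2$, the bound is written $\eta_1(\rho, x)$, and the admissible continua are those of diameter at least $\rho$.

```latex
% Illustration under assumptions: \bar{r} = closed unit disc, 0 < \rho \le 2.
% First map: non-degenerate (an isometric embedding of the disc).
% Second map: degenerate (constant on every vertical segment).
\begin{align*}
x(u,v) &= (u, v, 0, \dots, 0): &
  \operatorname{osc}_C x &= \operatorname{diam} C \ge \rho
  \ \text{for every admissible } C,
  & \eta_1(\rho, x) &= \rho > 0; \\
x(u,v) &= (u, 0, \dots, 0): &
  \operatorname{osc}_C x &= 0
  \ \text{for } C = \{u = 0,\ |v| \le \rho/2\},
  & \eta_1(\rho, x) &= 0 .
\end{align*}
```

In the first case the bound $\rho$ is attained on any straight segment of length $\rho$ in the disc; in the second the vanishing of the bound reflects exactly the failure of non-degeneracy.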


THEOREM 4. If x(u) is non-degenerate on $\bar r$, u = u(U) is a 1-1
continuous transformation of $\bar r$ into $\bar R$, and we define X(U) = x[u(U)], then
X(U) is non-degenerate on $\bar R$.


The following theorem simplifies the argument in § 2: 


THEOREM 5. If $\{x_n(u)\}$ is a sequence of non-degenerate vector functions
approaching the non-degenerate vector function x(u) uniformly, we can
find a function $\eta_1(\rho)$, positive for $\rho > 0$, such that

$\eta_1(\rho) \le \eta_1(\rho, x)$; $\quad \eta_1(\rho) \le \eta_1(\rho, x_n)$, (n = 1, 2, . . .).


Proof. We may define $\eta_1(\rho)$ as the greatest lower bound of $\eta_1(\rho, x)$ and
the numbers $\eta_1(\rho, x_n)$. If this is zero for some $\rho > 0$, we may extract a
subsequence, $\{n_k\}$, of the positive integers so that $\eta_1(\rho, x_{n_k}) \to 0$, which contradicts
Theorems 2 and 3.


2. The existence of a generalized conformal representation of an arbitrary 
non-degenerate surface of finite Lebesgue area. In this section, we prove a 
selection theorem for a sequence of representations of non-degenerate surfaces 
which converge to a non-degenerate surface. This theorem together with its 
proof is the exact analog for the vector functions representing these surfaces 
of Lebesgue’s selection theorem for a sequence of monotone functions with 
uniformly bounded Dirichlet integrals which converges uniformly on the 
boundary of a region. By means of this theorem, the main result of the paper 
is established. The method of proof used extends to representations on an 
n-sphere of n-dimensional manifolds which are of class L with

$\int [g_{11} + \cdots + g_{nn}]^{n/2}\, du^1 \cdots du^n < M$

independent of n, the $g_{ii}$ being among the coefficients $g_{ij}$ of the fundamental
(positive definite) form

$ds^2 = \sum_{i=1}^{n} \sum_{j=1}^{n} g_{ij}\, du^i\, du^j$.

Definition 1. A function, f(x, y), will be said to be absolutely continuous
in the sense of Tonelli¹⁶ (A.C.T.) in a region $\bar r$ if it is A.C.T. on every
rectangle interior to r with $f_x$ and $f_y$ summable over r; f(x, y) is A.C.T.¹⁷
for $a \le x \le b$, $c \le y \le d$ if it is continuous there and


¹⁶ L. Tonelli, "Sulla quadratura delle superficie," Atti della Reale Accademia
Nazionale dei Lincei, ser. 6, vol. 3 (1926), pp. 633-638.

¹⁷ In a paper, "Complements of potential theory II," American Journal of
Mathematics, vol. 55 (1933), pp. 42-46, G. C. Evans has shown that this concept is identical
with that of a continuous potential function of its generalized derivatives (see ref. 19).




(i) for almost every X, $a \le X \le b$, f(X, y) is absolutely continuous in
y, and for almost every Y, $c \le Y \le d$, f(x, Y) is absolutely continuous in x;

(ii) $\int_a^b V_c^d[f(X, y)]\, dX$ and $\int_c^d V_a^b[f(x, Y)]\, dY$ both exist, where
$V_c^d[f(X, y)]$, for instance, denotes the variation of f(X, y) on (c, d)
considered as a function of y alone.

It is known that $f_x$ and $f_y$ exist almost everywhere and are summable.
The following definitions and lemmas may be found in the literature and
are included here merely for the sake of completeness.


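LEMMA 1.¹⁸ If $\{f_n(x, y)\}$ is a sequence of functions, A.C.T. on $\bar r$,
which approach the continuous function, f(x, y), uniformly and there exist
constants M, p > 1, and q > 1, independent of n such that

$\iint_r \left[ |\partial f_n/\partial x|^p + |\partial f_n/\partial y|^q \right] dx\, dy < M$,

then f(x, y) is A.C.T. on $\bar r$, $|f_x|^p$ and $|f_y|^q$ are summable on r, and

$\iint_r |f_x|^p\, dx\, dy \le \liminf_{n \to \infty} \iint_r |\partial f_n/\partial x|^p\, dx\, dy$;
$\iint_r |f_y|^q\, dx\, dy \le \liminf_{n \to \infty} \iint_r |\partial f_n/\partial y|^q\, dx\, dy$.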


LEMMA 2.¹⁹ Let f(x, y) be A.C.T. in $\bar r$ and let x = x(s, t), y = y(s, t)
be a 1-1 transformation of $\bar r$ into $\bar R$, where x(s, t) and y(s, t) are continuous
together with their first partial derivatives and $|\partial(x, y)/\partial(s, t)| \ge A > 0$.
Then if $\phi(s, t) = f[x(s, t), y(s, t)]$, we have that $\phi(s, t)$ is A.C.T. in $\bar R$, and

(2.1) $\phi_s = f_x x_s + f_y y_s$, $\quad \phi_t = f_x x_t + f_y y_t$

almost everywhere.


LEMMA 3. Suppose (i) f(x, y) is A.C.T. in $\bar r$ with $f_x^2$ and $f_y^2$ summable
over r, (ii) x = x(s, t), y = y(s, t) is a 1-1 conformal transformation
(merely continuous on r*) of $\bar r$ into $\bar R$, (iii) $\phi(s, t) = f[x(s, t), y(s, t)]$.
Then (a) $\phi(s, t)$ is A.C.T. on $\bar R$, (b) its partial derivatives are given almost
everywhere by the formulas (2.1), (c) $\phi_s^2$ and $\phi_t^2$ are summable over R,
and (d) we have
¹⁸ C. B. Morrey, Jr., loc. cit. (R).


¹⁹ G. C. Evans, "Fundamental points of potential theory," Rice Institute Pamphlets,
vol. 7, no. 4 (1920), pp. 274-285, particularly.




$\iint_R (\phi_s^2 + \phi_t^2)\, ds\, dt = \iint_r (f_x^2 + f_y^2)\, dx\, dy$.


Proof. Lemma 3 is an immediate consequence of the preceding lemma as 
is easily seen by considering the mapping of regions entirely interior to R 
on regions entirely interior to r by the given transformation. 
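
The equality in part (d) of Lemma 3 rests on the invariance of the Dirichlet integrand under a conformal change of variables. The formal computation is sketched below for the interior of the regions, assuming the map is sense-preserving, so that the Cauchy-Riemann relations $x_s = y_t$, $x_t = -y_s$ hold; the passage to the full regions is what the exhaustion argument of the proof supplies.

```latex
% Assumption: the map (s,t) -> (x,y) is sense-preserving conformal,
% so x_s = y_t and x_t = -y_s; the chain rule is taken from (2.1).
\begin{align*}
\phi_s &= f_x x_s + f_y y_s, \qquad
\phi_t = f_x x_t + f_y y_t = -f_x y_s + f_y x_s, \\
\phi_s^2 + \phi_t^2 &= (f_x^2 + f_y^2)(x_s^2 + y_s^2)
  = (f_x^2 + f_y^2)\, \frac{\partial(x,y)}{\partial(s,t)}, \\
\iint_R (\phi_s^2 + \phi_t^2)\, ds\, dt
  &= \iint_r (f_x^2 + f_y^2)\, dx\, dy .
\end{align*}
```

The cross terms $\pm 2 f_x f_y x_s y_s$ cancel in the second line, and the Jacobian $x_s y_t - x_t y_s$ equals $x_s^2 + y_s^2$ under the Cauchy-Riemann relations.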


Definition 2.²⁰,²¹ A representation, x = x(u), $u \in \bar r$, of a surface, S, is
said to be of class L if


(i) the components, $x^i(u, v)$, are all A.C.T. on $\bar r$,


Wn ff 


v) v) 


dudv=0, 


(1,j 
uth uth 


$r_a$ being the set of points of r at a distance $\ge a$ from r* (i.e., this is true
for all these a).


LEMMA 4.²⁰,²¹ A convenient subclass of representations of class L is
determined by the following conditions:


(i) $x^i(u, v)$ is A.C.T., (i = 1, . . ., N),
(ii) $|x_u^i|^p$, $|x_v^i|^q$ are summable over r, $p, q \ge 1$, $1/p + 1/q \le 1$,
(i = 1, . . ., N).


We include the case where one of p and q is unity and the other infinite by
interpreting (ii), in the case where p = 1, $q = \infty$, for instance, to mean


(ii') $|x_v^i| < M$, $|x_u^i|$ summable in r, (i = 1, . . ., N).
Surfaces z = f(x, y) with f(x, y) A.C.T. are also seen to be of class L.


LEMMA 5.²²,²³ If the representation, x = x(u), of the surface S, is of
class L, L(S) is given by the usual integral formula.


Definition 3.²²,²³ The representation, x = x(u), of the surface S is generalized
conformal if it satisfies conditions (i) and (ii) of Lemma 4 with


²⁰ C. B. Morrey, Jr., loc. cit. (R).
²¹ E. J. McShane, loc. cit. (2nd ref., footnote 11).
²² C. B. Morrey, Jr., loc. cit. (R).

²³ E. J. McShane, loc. cit. (2nd ref.).




p = q = 2 and E = G, F = 0 almost everywhere, E, F, G being given by
their usual formulas.


Definition 4. We say that the points $P_1$ and $P_2$ of a surface S, S: x = x(u),
$u \in \bar r$, are logically distinct if they correspond to distinct values $u_1$ and $u_2$ in $\bar r$,
such that x(u) is not constant over any continuum containing them both. This
property is clearly invariant under changes of parameter, u. If S²⁴ and x(u)
are non-degenerate, the above merely requires that $u_1 \ne u_2$.


LEMMA 6.²⁵ Let $\Pi$ be a non-degenerate polyhedron. It possesses a
generalized conformal representation on the unit circle in which three given
logically distinct points on the boundary of $\Pi$ correspond to three given distinct
points on the boundary of the unit circle. If $\Pi$ is degenerate, the mapping
is impossible.


LEMMA 7.²⁶ Let S, S: x = x(u), and $S_n$, $S_n$: x = $x_n(u)$, (n = 1, 2, . . .),
be continuous surfaces. Suppose (i) the given representations of the $S_n$ are
generalized conformal, (ii) the functions $x_n(u)$ converge uniformly to x(u),
and (iii) $\lim L(S_n) = L(S)$. Then the given representation of S is generalized
conformal.


THEOREM 1. Given that S, S: x = x(u), $u \in \bar r$, and $S_n$, $S_n$: x = $x_n(u)$,
$u \in \bar r$, (n = 1, 2, . . .), are non-degenerate surfaces, that $\lim S_n = S$, that x(u)
and $x_n(u)$ are all non-degenerate, and that $x_n(u)$ approaches x(u) uniformly.
Suppose x = $X_n(U)$, $U \in \bar R$, is a representation of $S_n$ satisfying: (i) it is of
class L; (ii) one of the induced (T, § 4, Theorem 3, and Def. 8) continuous
monotone transformations, u = $u_n(U)$, of $\bar R$ into $\bar r$ carries three fixed
(independently of n) distinct points, A, B, and C, of R* into three fixed distinct
points, a, b, and c, respectively, of r*; (iii) there is a constant M, independent
of n, such that

$\iint_R \left[ |\partial X_n/\partial U|^2 + |\partial X_n/\partial V|^2 \right] dU\, dV < M$, (n = 1, 2, . . .).

Then the $X_n(U)$ are equicontinuous on $\bar R$.
If the $X_n(U)$ are not normalized on the boundary (i.e., do not satisfy
(ii)), they are equicontinuous on any closed set interior to R.


²⁴ A surface is said to be non-degenerate if it possesses a non-degenerate
representation.

²⁵ See for instance, C. Carathéodory, "Conformal representation," Cambridge Tracts
in Mathematics and Mathematical Physics, no. 28, § 161 and §§ 125-130 particularly.

²⁶ C. B. Morrey, Jr., "A class of representations of manifolds (Part II)," American
Journal of Mathematics, vol. 56, no. 2 (1934), pp. 275-293.




In the normalized case, any limit function, X(u), will satisfy all three 
conditions, and in the second case any limit function (defined over all of R) 
will satisfy (i) and (iii) on every closed region interior to R. 


Proof. It is clear (Theorem 4, § 1) that we may take $\bar r$ to be the unit
circle, and a, b, and c to be equally spaced. On account of Lemma 3, we may
also take $\bar R$ to be the unit circle and A, B, and C to be equally spaced.²⁷ It is
clearly sufficient to show that the functions $u_n(U)$ are equicontinuous.

We wish to observe at the outset that if $P^*_1$ and $P^*_2$ are points of R*
on a closed large (small) arc bounded by two fixed points and containing
(not containing) the third, and we choose $P^*_1P^*_2$ as that arc bounded by
$P^*_1$ and $P^*_2$ and lying in the above large (small) arc, then all the points of the
arc $P^*_1P^*_2$ are carried into the corresponding arc $p^*_1p^*_2$, $p^*_i = T_n(P^*_i)$,
(i = 1, 2). This follows from the normalization of the $T_n$ and the nature
of the continua of R* which are carried into points of r* by a continuous
monotone transformation.²⁸ Thus $|u_n(P^*_1) - u_n(P^*_2)|$ is equal to the
oscillation of $u_n(U)$ on an arc $P^*_1P^*_2$ which contains at most one of the fixed
points A, B, C, unless this oscillation is equal to 2 in which case the above
expression is not less than $3^{1/2}$.

Let $C(P_0, \rho)$ denote the circle with any center at $P_0$, and radius $\rho$,
$3^{1/2}/2 \ge d > \rho > 0$. Suppose that the oscillation of some $u_n(U)$ in
$C(P_0, \rho_0) \cdot \bar R \ge \epsilon$, $2 \ge \epsilon > 0$, $d > \rho_0 > 0$. Then, from T, § 3, Theorem 1,
it is clear that the oscillation of $u_n(U)$ on $[C(P_0, \rho) \cdot \bar R]^* \ge \epsilon$, $d > \rho \ge \rho_0$.
Define $C(\rho)$ to be the arc of $[C(P_0, \rho)]^*$ which lies in $\bar R$, and let $P^*_{1\rho}$ and
$P^*_{2\rho}$ be its end points, if they exist, in which case we let $P^*_{1\rho}P^*_{2\rho}$ be the
arc $[C(P_0, \rho) \cdot \bar R]^* \cdot R^*$. Then it is clear that the oscillation of $u_n(U)$ over
$C(\rho) \ge \epsilon/2$, $d > \rho \ge \rho_0$, for (i) if the oscillation over $P^*_{1\rho}P^*_{2\rho}$ (which may
be null or a point) $\le \epsilon/2$, this is obviously the case, and (ii) if the oscillation
over $P^*_{1\rho}P^*_{2\rho} > \epsilon/2$, then the oscillation over $C(\rho) \ge |u_n(P^*_{1\rho}) - u_n(P^*_{2\rho})|$
which is $\ge$ the smaller of the numbers $3^{1/2}$ and the oscillation of $u_n(U)$ over
$P^*_{1\rho}P^*_{2\rho}$ (since $P^*_{1\rho}P^*_{2\rho}$ obviously cannot contain more than one of the fixed
points), both of which exceed $\epsilon/2$, since $\epsilon \le 2$.

Now, by Theorem 5, § 1, we can find an $\eta_1(\epsilon)$, positive with $\epsilon$, such that,


²⁷ The argument can be carried through if R is any (Jordan) region, however.

²⁸ Or it follows directly from the theorem (T, § 3, Theorem 2) that a monotone
transformation is the uniform limit (in the sense that the vector functions approach
their limit uniformly) of a sequence of 1-1 continuous transformations, the statement
being obvious for these.




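for each $\epsilon > 0$, $0 < \eta_1(\epsilon) \le \eta_1(\epsilon, x)$, $0 < \eta_1(\epsilon) \le \eta_1(\epsilon, x_n)$, (n = 1, 2, . . .).
Hence, if we define


we see that $|u_n(U_1) - u_n(U_2)| < \epsilon$ when $|U_1 - U_2| < \delta(\epsilon)$. For if this
is not the case for some $u_n(U)$ and points $U_1$ and $U_2$, the oscillation of $u_n(U)$
in $C(P_0, kd) \cdot \bar R \ge \epsilon$, where $U_0 = (U_1 + U_2)/2$. Then for every $\rho$, $d > \rho \ge kd$,
the oscillation of $u_n(U)$ on $C(\rho) \ge \epsilon/2$, and thus the oscillation of
$X_n(U) \ge \eta_1(\epsilon/2)$ on $C(\rho)$. Let us choose polar coördinates
with pole at $P_0$ (notice Lemma 2), and let $\theta_{1\rho}$ and $\theta_{2\rho}$, $2\pi \ge \theta_{2\rho} - \theta_{1\rho} > 0$
($> 2\pi/3$ in fact), be the angular coördinates of $P^*_{1\rho}$ and $P^*_{2\rho}$ respectively
(chosen so that $C(\rho)$ is the arc $\theta_{1\rho} \le \theta \le \theta_{2\rho}$) if they exist, otherwise let
$\theta_{1\rho} = 0$, $\theta_{2\rho} = 2\pi$. Then using Schwarz's inequality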


aX, {2 dp » |? 
=> 
u> | aU | dUdV dé 


> 1 OXn | | = log 4 =" 
p 92p— LJ a, 


kd 


which is impossible. 
If we choose $r_0 < 1$ and $d = 1 - r_0$, the above argument demonstrates
the equicontinuity of the $u_n(U)$ in the closed circle $U^2 + V^2 \le r_0^2$
independently of the normalization on $U^2 + V^2 = 1$. This demonstrates the
second statement in the conclusion of the theorem. The third statement
follows immediately from Lemma 1.
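
The Schwarz-inequality step in the equicontinuity argument above can be sketched as follows. This is the standard form of the estimate and is not claimed to reproduce the author's display term by term. The symbol $M$ (a uniform bound for the Dirichlet integrals of the $X_n$ over $R$), the shorthand $m = m(\epsilon/2)$, and the absolute continuity of $X_n$ in $\theta$ on almost every arc $C(\rho)$ are assumptions introduced here for illustration.

```latex
% Polar coordinates (\rho, \theta) with pole at P_0; C(\rho) is the arc
% \theta_{1\rho} \le \theta \le \theta_{2\rho}.
% Assumed: the oscillation of X_n on C(\rho) is at least m for kd \le \rho < d.
\begin{align*}
m &\le \int_{\theta_{1\rho}}^{\theta_{2\rho}}
       \left|\frac{\partial X_n}{\partial\theta}\right| d\theta
   && \text{(oscillation on an arc $\le$ length of its image)} \\[4pt]
m^2 &\le (\theta_{2\rho}-\theta_{1\rho})
       \int_{\theta_{1\rho}}^{\theta_{2\rho}}
       \left|\frac{\partial X_n}{\partial\theta}\right|^2 d\theta
     \le 2\pi \int_{\theta_{1\rho}}^{\theta_{2\rho}}
       \left|\frac{\partial X_n}{\partial\theta}\right|^2 d\theta
   && \text{(Schwarz's inequality)} \\[4pt]
M &\ge \iint_R \left(
       \left|\frac{\partial X_n}{\partial U}\right|^2 +
       \left|\frac{\partial X_n}{\partial V}\right|^2 \right) dU\,dV
   && \text{(assumed uniform bound)} \\
  &\ge \int_{kd}^{d} \frac{d\rho}{\rho}
       \int_{\theta_{1\rho}}^{\theta_{2\rho}}
       \left|\frac{\partial X_n}{\partial\theta}\right|^2 d\theta
   && \text{(drop the radial term; } dU\,dV = \rho\,d\rho\,d\theta\text{)} \\
  &\ge \frac{m^2}{2\pi}\int_{kd}^{d}\frac{d\rho}{\rho}
   = \frac{m^2}{2\pi}\,\log\frac{1}{k}.
\end{align*}
```

Under these assumptions, any $k$ with $\log(1/k) > 2\pi M/m^2$ makes the last line exceed $M$, which gives the contradiction; the admissible $k$ depends only on $M$ and $m(\epsilon/2)$, not on $n$ or on $P_0$.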


THEOREM 2. A necessary and sufficient condition that a non-degenerate 
surface, S, be of finite Lebesgue area is that it possess a generalized conformal 
(normalized) representation on the closed unit circle. 


Proof. The sufficiency of the condition is immediate from Lemmas 4 and 5.
To prove the existence of such a map, let $\{\Pi_n\}$ be a sequence of polyhedra
approaching $S$, where $\lim L(\Pi_n) = L(S)$, and $x = x(u)$ be a non-degenerate
representation of $S$ on $r$. It is clear that we may replace each $\Pi_n$ by a
non-degenerate polyhedron, $\bar{\Pi}_n$, such that
$|L(\Pi_n) - L(\bar{\Pi}_n)| < 1/n$ and $\|\Pi_n, \bar{\Pi}_n\| < 1/n$ by merely
moving the vertices²⁹ of $\Pi_n$ slightly. Then, let


29 By definition (given for instance in C. B. Morrey, Jr., loc. cit. R) $\Pi_n$ can be
represented on Q (the unit square) by a function $x_n(u)$ which is linear in triangles
(a finite number of them). The vertices of $\Pi_n$ are merely the points corresponding
to the vertices of the triangles in Q.




$x = x_n(u)$ be a sequence of non-degenerate representations of $\bar{\Pi}_n$ such
that $x_n(u)$ approaches $x(u)$ uniformly (that this is possible follows from T,
§ 5, Theorem 2). Then, let $a$, $b$, and $c$ be three distinct points of $r^*$, and
$A$, $B$, and $C$ three distinct points of $R^*$, where $R$ is the closed unit circle.
Let $x = X_n(U)$, $U \in R$, be a generalized conformal representation of
$\bar{\Pi}_n$ on $R$ so that an induced transformation, $u = u_n(U)$ of $R$ into $r$,
carries $A$, $B$, and $C$ into $a$, $b$, and $c$ respectively. By Lemma 5, and the
conformality,


$$L(\bar{\Pi}_n) = \tfrac{1}{2} \iint_R \left[\, |X_{nU}|^2 + |X_{nV}|^2 \,\right] dU\, dV.$$


Thus, the hypotheses of Theorem 1 are fulfilled and thus we may extract a
subsequence of the $X_n(U)$ which converges uniformly to a function $X(U)$.
Clearly $x = X(U)$ is a representation of $S$³⁰ and, by Lemma 7, it is
generalized conformal.
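
The role of conformality in the area formula used in the proof above can be illustrated by an elementary inequality. The sketch below assumes that the area is expressed by the classical integral of $\sqrt{EG - F^2}$ (which is presumably what Lemma 5 supplies; that reading is an assumption), and the symbols $E$, $F$, $G$ are introduced here and are not taken from the text.

```latex
% Assumed notation: E = |X_U|^2,  F = X_U \cdot X_V,  G = |X_V|^2.
\begin{align*}
\sqrt{EG - F^2} \;\le\; \sqrt{EG} \;\le\; \tfrac{1}{2}\,(E + G),
\end{align*}
% Equality holds throughout if and only if F = 0 and E = G,
% i.e. exactly under the conformality relations. In that case
\begin{align*}
\iint_R \sqrt{EG - F^2}\; dU\, dV
  \;=\; \tfrac{1}{2} \iint_R \left( |X_U|^2 + |X_V|^2 \right) dU\, dV .
\end{align*}
```

The first inequality drops the term $F^2 \ge 0$, and the second is the inequality between the geometric and arithmetic means of $E$ and $G$; so half the Dirichlet integral always bounds the area integral from above and coincides with it exactly when $E = G$ and $F = 0$ almost everywhere.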

The following very important theorem, due to McShane³¹ and used by
him in his very interesting solution of the problem of Plateau, is a consequence
of the above theorem and of T, § 5, Theorem 5.


THEOREM 3. Let S be a Lebesgue monotone (T, § 5, Def. 4) surface of
finite area bounded by a Jordan curve. Then S possesses a generalized
conformal representation on the unit circle in which three given distinct points
on the boundary of S correspond to three given distinct points on the
circumference of the unit circle.


THE UNIVERSITY OF CALIFORNIA. 


30 For let $S_1$ be this surface. Then
$\lim \|S, \bar{\Pi}_n\| = \lim \|S_1, \bar{\Pi}_n\| = 0$. But since the Fréchet
distance satisfies the "triangle inequality,"
$\|S, S_1\| \le \|S, \bar{\Pi}_n\| + \|\bar{\Pi}_n, S_1\|$,
we see that $\|S, S_1\| = 0$ and thus $S = S_1$.
31 E. J. McShane, loc. cit. (1st ref.).




